Abstract: For general multi-hop queueing networks, delay 
optimal network control has unfortunately been an outstanding 
problem. The dynamic backpressure (BP) algorithm elegantly 
achieves throughput optimality, but does not yield good delay 
performance in general. In this paper, we obtain an asymptotically 
delay optimal control policy, which resembles the BP algorithm 
in basing resource allocation and routing on a backpressure 
calculation, but differs from the BP algorithm in the form of 
the backpressure calculation employed. The difference suggests 
a possible reason for the unsatisfactory delay performance of 
the BP algorithm, i.e., the myopic nature of the BP control. 
Motivated by this new connection, we introduce a new class 
of enhanced backpressure-based algorithms which incorporate a 
general queue-dependent bias function into the backpressure term 
of the traditional BP algorithm to improve delay performance. 
These enhanced algorithms exploit queue state information beyond 
one hop. We prove the throughput optimality and characterize the 
utility-delay tradeoff of the enhanced algorithms. We further focus 
on two specific distributed algorithms within this class, which have 
demonstrably improved delay performance as well as acceptable 
implementation complexity. 

Index Terms: dynamic backpressure algorithms, congestion 
control, delay optimal control, throughput optimal control, 
dynamic programming, Lyapunov drift. 


I. Introduction 


With the significant increase in demand for real-time services, 
it is well recognized that networks must be jointly 
optimized across the physical, medium access control (MAC), 
and network layers to support delay-sensitive applications. 
Delay optimal network control for general multi-hop queueing 
networks, which seeks to minimize some function of average 
delay (or average queue size) by incorporating resource allocation 
and routing across different layers, has unfortunately been 
an outstanding problem for some time. Often, even the basic 
structural properties of the delay optimal control policy are not 
known. While dynamic programming represents a systematic 
approach for delay optimal control, there generally exist only 
numerical solutions [1]-[5]. These solutions do not typically 
offer many design insights and are usually impractical for 
implementation in large-scale multi-hop networks, due to the 
curse of dimensionality [6]. 

A notable success in networking research is the formulation 
of the throughput optimal network control problem and its 
solution via the dynamic backpressure (BP) algorithm [7], [8]. 
Throughput optimal control seeks to ensure the stability of 
general multi-hop queueing networks (all queue sizes remain 
finite for all time) for any arrival rate vector within the network 
stability region. The BP algorithm is obtained using Lyapunov 
drift techniques. It incorporates resource allocation and routing 
across the physical, MAC, and network layers, and elegantly 
achieves throughput optimality via load balancing [7], [8]. The 
algorithm has also been combined with flow control in the 
transport layer to yield maximum network utility when the 
data arrival rate is outside the network stability region [8]. One 
major shortcoming of the BP algorithm, however, is that it does 
not yield good delay performance in general. In routing packets, 
the BP algorithm typically explores all possible paths between 
sources and destinations (i.e., load balancing over the entire 
network), without explicitly considering delay performance. 
This extensive exploration is essential for maintaining stability 
when the network is heavily loaded. Under light or moderate 
loading, however, packets may be sent over unnecessarily long 
routes, which leads to excessive delays. 

For any arrival rate vector within the network stability 
region, the delay optimal control minimizes average delay 
(average queue size), while the BP algorithm ensures finite 
average queue size and typically has good delay performance 
only under heavy load. Therefore, two interesting questions 
are: (1) whether there is any subtle connection between the 
two network control solutions and (2) what accounts for the 
delay performance gap between them. Better understanding 
of these two questions may motivate the design of enhanced 
BP algorithms with improved delay performance. There are 
several potential challenges toward this direction. First, it is 
not clear how one should improve the delay performance of 
the BP algorithm by approximating the delay optimal control 
in a tractable manner, in order to avoid the prohibitively 
high complexity of dynamic programming. Second, it is not 
clear how to maintain the desirable throughput optimality of 
the BP algorithm when the BP control structure is modified 
for improving the delay performance. In this paper, we shall 
address the above questions and challenges. 

A. Main Contributions 

We first study the connection between delay optimal network 
control and the BP algorithm (throughput optimal network 
control). Using dynamic programming and Taylor’s theorem, 
we obtain an asymptotically delay optimal control policy when 
the scheduling slot duration is small. Surprisingly, we show 
that the asymptotically delay optimal control, obtained using 
dynamic programming, shares striking similarities with the BP 
algorithm, obtained using Lyapunov drift techniques. Specifically, 
the two algorithms both base resource allocation and 
routing on a backpressure calculation, but differ in the form of 
the backpressure calculation employed. In the BP algorithm, 
the backpressure of a link is derived from the differences of 
queue lengths at the two end nodes of the link. Thus, the 
BP backpressure term reflects local queue state information 
(QSI). In the asymptotically delay optimal control algorithm, 
the backpressure of a link is derived from the differences of 
the derivatives of the value function of the dynamic program 
at the two end nodes of the link. Since in general the value 
function depends on the global QSI, the backpressure term 
for the asymptotically delay optimal control is a function of 
the global QSI. This observation suggests a possible reason 
for the poor delay performance of the BP algorithm, i.e., the 
myopic nature of the control, which relies only on one-hop 
queue size differences. To the best of our knowledge, this is 
the first work which provides an analytical connection between 
the two network control solutions. 

Motivated by the above connection, we design enhanced BP 
algorithms with improved delay performance via the use of 
QSI beyond one hop. Specifically, we present a new class 
of enhanced BP algorithms which maintain a generalized 
notion of throughput optimality while exhibiting significantly 
improved delay performance, relative to the traditional BP 
algorithm. In lightly or moderately loaded networks, where the 
delay performance of the traditional BP algorithm is poor, the 
enhanced BP algorithms reduce average delay by (1) exploiting 
the margin between the arrival rate vector and the boundary of 
the network stability region, and (2) making use of QSI beyond 
one hop in a simple and flexible manner, via the incorporation 
of a QSI-dependent bias function into the backpressure calculation. 
We propose two specific algorithms, named BPnxt and 
BPmin, within this class of enhanced BP algorithms. These 
two algorithms promise to improve delay performance by using 
downstream QSI to clarify congestion patterns, while allowing 
for distributed implementation with manageable complexity. 
BPnxt has the same implementation complexity (in order) as 
the traditional BP algorithm. BPmin has an implementation 
complexity which is higher (in order) than that for the traditional 
BP algorithm but lower than that for other BP-based 
control algorithms with similar delay performance. Next, the 
delay performance of both BPnxt and BPmin can be improved 
further by incorporating an extra QSI-independent shortest path 
bias term into the backpressure calculation. Finally, we present 
a new class of enhanced joint flow control and BP algorithms 
for the case where the traffic arrival rate is outside the network 
stability region, and demonstrate their superior utility-delay 
performance tradeoff. 

B. Related Work 

A number of previous papers have focused on improving 
the delay performance of the traditional BP-based algorithms. 
References [9] and [10] improve the delay by incorporating the 
shortest path (in terms of the number of hops) concept to avoid 
the extensive exploration of paths in the BP algorithm. Specifically, 
in [9], a (constant) shortest path bias, parameterized by 
a per-link cost B, is added to the backpressure term so that 
nodes are inclined to route packets toward their destinations 
using shorter paths. The algorithm proposed in [9] is called 
BPbias here. In [10], a joint traffic-splitting and shortest-path- 
aided BP routing algorithm, called BPSP here, is proposed, 
where the traffic splitting is parameterized by K. A hop- 
queue structure is used. The algorithm incorporates the shortest 
path concept by minimizing the average number of hops 
between sources and destinations, using the hop-queue length 
difference in the backpressure term. The traditional BP and 
BPbias algorithms require $O(N^2 C)$ computational complexity 
for the backpressure calculation in each slot, where N and 
C are the number of nodes and the number of commodities 
in the network, respectively. The BPSP algorithm, on the 
other hand, requires $O(N^4 C)$ computational complexity for the 
backpressure calculation in each slot. As shown in Section VII, 
one potential challenge for BPbias and BPSP is that their delay 
performance relies heavily on the choices for parameters B 
and K. B and K must be selected for particular levels of 
traffic loading, which may be difficult to predict beforehand 
in practical networks. 

Reference [11] improves the delay of the traditional BP 
algorithm by introducing redundant traffic and a duplicate 
queue structure with finite buffers, to avoid delay increase 
due to low queue occupancy. The one-hop (finite and local) 
duplicate queue length difference is added to the backpressure 
term, thereby capturing limited congestion information in the 
network. In [12], a shadow queueing architecture is proposed to 
improve the delay performance of the traditional BP algorithm 
by reducing the end-to-end queue length difference. The one- 
hop shadow queue length difference is used as the backpressure 
term. In [13], a delay-based BP algorithm is proposed to 
improve the delay performance using a new delay metric. The 
proposed algorithms in [12] and [13], however, are designed 
for networks in which the routes of each flow are fixed before 
the arrival of packets. Reference [14] improves the order of the 
utility-delay tradeoff by forming a virtual backlog process and 
using the LIFO service discipline on top of the traditional BP- 
based algorithm. The order improvement, however, hinges on 
the ability to drop a certain fraction of packets. In [15], [16], 
receiver diversity is used to improve the delay performance 
of the traditional BP algorithm for networks with unreliable 
channel conditions. 

The motivation, method and design for our proposed class of 
enhanced BP algorithms are novel and differ significantly from 
the algorithms proposed in the above papers. Our motivation 
stems primarily from the connection we establish between 
delay optimal network control and the throughput optimal 
BP algorithm. Our proposed class of enhanced BP algorithms 
improve delay performance by incorporating a general QSI- 
dependent bias function into the backpressure calculation, 
thereby mitigating the myopia of the traditional BP algorithm. 

II. Network Model 

In this section, we establish the network model. Consider a (wireline or wireless) multi-hop network modeled by a directed graph $\mathcal{G} = (\mathcal{N}, \mathcal{L})$, where $\mathcal{N}$ and $\mathcal{L}$ denote sets of $N$ nodes and $L$ directed links, respectively. Time is slotted with slots indexed by $t \in \{0, 1, 2, \ldots\}$. The slot duration is $\Delta > 0$ (sec). Data entering the network is associated with a particular commodity. Let $\mathcal{C}$ represent the set of $C$ commodities in the network. Assume that there is one destination node $\mathrm{dest}(c)$ for each commodity $c \in \mathcal{C}$. Let $A_n^{(c)}(t)\Delta \ge 0$ denote the amount of exogenous arrivals (bits) for commodity $c$ to node $n$ during slot $t$, assumed to enter the transmission buffer at the end of slot $t$. Assume that $A_n^{(c)}(t) \in [0, A_{n,\max}^{(c)}]$ is i.i.d. with respect to (w.r.t.) $t$ with arrival rate $\lambda_n^{(c)} = E[A_n^{(c)}(t)]$ (bit/sec), where $A_{n,\max}^{(c)} < \infty$. In addition, assume that processes $\{A_n^{(c)}(t)\}$ for different node-commodity pairs are mutually independent. Let $\mathbf{A}(t) = (A_n^{(c)}(t))$ and $\boldsymbol{\lambda} \triangleq (\lambda_n^{(c)})$.

Let $S(t) \in \mathcal{S}$ denote the topology state of the network in slot $t$, where $\mathcal{S}$ is the finite topology state space. The topology state $S(t)$ can be used to model channel fading in wireless networks. Assume $S(t)$ is i.i.d. w.r.t. $t$. Let $I(t) \in \mathcal{I}$ denote the resource allocation action at slot $t$, where $\mathcal{I}$ is the bounded resource allocation action space. The resource allocation action $I(t)$ may reflect a set of power allocations or a set of conflict constraints in wireless networks. Let $R_{ab}(S(t), I(t)) \ge 0$ denote the transmission rate (bit/sec) over link $(a, b)$ under $S(t)$ and $I(t)$, where $R_{ab}(S(t), I(t)) = 0$ if $(a, b) \notin \mathcal{L}$. Assume $R_{ab}(S(t), I(t)) \le R_{\max}$ for all $(a, b) \in \mathcal{L}$, $S(t) \in \mathcal{S}$ and $I(t) \in \mathcal{I}$, where $R_{\max} < \infty$ is an upper bound on the maximum transmission rate over any link. Let $\nu_{ab}^{(c)}(t)\Delta \ge 0$ represent the amount of commodity $c$ data (bits) delivered over link $(a, b)$ during slot $t$, satisfying:

$\sum_{c \in \mathcal{C}} \nu_{ab}^{(c)}(t) \le R_{ab}(S(t), I(t)), \quad \forall (a, b) \in \mathcal{L},\ c \in \mathcal{C}$ (1)

$\nu_{ab}^{(c)}(t) = 0, \quad \forall (a, b) \notin \mathcal{L}^{(c)},\ c \in \mathcal{C}$ (2)

where $\mathcal{L}^{(c)}$ is the set of $L^{(c)}$ links that are allowed to transmit commodity $c$ data. Let $\mathcal{R}$ denote the bounded routing action space, which is the bounded set of non-negative $\boldsymbol{\nu}(t) = (\nu_{ab}^{(c)}(t))$ satisfying (1) and (2), for all $S(t) \in \mathcal{S}$ and $I(t) \in \mathcal{I}$.

Data corresponding to different commodities are queued separately at each node, in buffers of infinite size. Let $U_n^{(c)}(t) \ge 0$ denote the amount of commodity $c$ data (bits) at node $n$ at the beginning of slot $t$ in the network layer. Let $\mathbf{U}(t) = (U_n^{(c)}(t)) \in \mathcal{U}$ denote the network layer queue state information (QSI) at the beginning of slot $t$, where $\mathcal{U}$ denotes the nonnegative QSI state space. Any data successfully delivered to its destination is assumed to exit the network layer. Thus, for each commodity $c \in \mathcal{C}$, we set $U_n^{(c)}(t) = 0$ for all $t$, if node $n$ is the destination node of commodity $c$. For each commodity $c \in \mathcal{C}$ and node $n \in \mathcal{N}$, $n \ne \mathrm{dest}(c)$, the network queue dynamics satisfies:^1

$U_n^{(c)}(t+1) = U_n^{(c)}(t) - \sum_{b \in \mathcal{N}} \nu_{nb}^{(c)}(t)\Delta + A_n^{(c)}(t)\Delta + \sum_{a \in \mathcal{N}} \nu_{an}^{(c)}(t)\Delta$ (3)

for all $t$. Note that $\sum_{b \in \mathcal{N}} \nu_{nb}^{(c)}(t)\Delta$ bits are removed from the buffer at node $n$ for commodity $c$ before $A_n^{(c)}(t)\Delta$ and $\sum_{a \in \mathcal{N}} \nu_{an}^{(c)}(t)\Delta$ bits arrive. Thus, for all $t$, we require:

$\sum_{b \in \mathcal{N}} \nu_{nb}^{(c)}(t)\Delta \le U_n^{(c)}(t), \quad \forall n \in \mathcal{N},\ c \in \mathcal{C}.$ (4)
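The update in (3) and the feasibility requirement in (4) can be sketched in a few lines. The Python function below is only an illustration for a single commodity; the names `step_queues`, `nu`, `arrivals`, and `dest` are assumptions introduced here and do not come from the paper.

```python
def step_queues(U, nu, arrivals, delta, dest):
    """Advance one slot of (3) for a single commodity.

    U[n]        : backlog (bits) at node n at the start of the slot
    nu[a][b]    : delivered rate (bit/sec) over link (a, b)
    arrivals[n] : exogenous arrival rate (bit/sec) at node n
    delta       : slot duration (sec)
    dest        : destination node, whose backlog stays at 0
    """
    n_nodes = len(U)
    nxt = [0.0] * n_nodes
    for n in range(n_nodes):
        if n == dest:
            continue  # delivered data leaves the network layer
        out_bits = sum(nu[n][b] for b in range(n_nodes)) * delta
        # Constraint (4): a node cannot send more than it currently holds.
        if out_bits > U[n] + 1e-12:
            raise ValueError(f"constraint (4) violated at node {n}")
        in_bits = sum(nu[a][n] for a in range(n_nodes)) * delta
        nxt[n] = U[n] - out_bits + arrivals[n] * delta + in_bits
    return nxt


# Hypothetical 3-node line network 0 -> 1 -> 2 with destination 2.
nu = [[0.0, 3.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]]
next_U = step_queues([4.0, 2.0, 0.0], nu, [1.0, 0.0, 0.0], 1.0, dest=2)
```

Departures are subtracted before arrivals are added, which mirrors the ordering assumed in the text: exogenous and relayed bits only become available at the end of the slot.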

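In the following, we introduce some basic definitions.

Definition 1: A feasible stationary policy $\omega : \mathcal{S} \times \mathcal{U} \to \mathcal{I} \times \mathcal{R}$ is a mapping from the system (topology and QSI) state space to the system (resource allocation and routing) action space. Given system state $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$, $\omega$ determines the action $(I, \boldsymbol{\nu}) = \omega(s, \mathbf{u}) \in \mathcal{I} \times \mathcal{R}$, where $I$ and $\boldsymbol{\nu}$ satisfy $I \in \mathcal{I}$ and

$\sum_{c \in \mathcal{C}} \nu_{ab}^{(c)} \le R_{ab}(s, I), \quad \forall (a, b) \in \mathcal{L},\ c \in \mathcal{C}$ (5)

$\nu_{ab}^{(c)} = 0, \quad \forall (a, b) \notin \mathcal{L}^{(c)},\ c \in \mathcal{C}$ (6)

$\sum_{b \in \mathcal{N}} \nu_{nb}^{(c)}\Delta \le u_n^{(c)}, \quad \forall n \in \mathcal{N},\ c \in \mathcal{C}.$ (7)

^1 Due to the constraint in (2), the summations in (3) can be written as summations over all node indices. Note that we assume exogenous arrivals during a slot arrive into the transmission buffer at the end of the slot.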

Next, we define the notions of network stability and the network stability region.

Definition 2 (Network Layer Queue Stability): A single network layer queue is stable if $\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E[U_n^{(c)}(\tau)] < \infty$. A network of network layer queues is stable if all individual network layer queues of the network are stable.

Definition 3 (Network Stability Region): The network stability region $\Lambda$ is the closure of the set of all arrival rates $\boldsymbol{\lambda}$ for which all network layer queues can be stabilized by some feasible policy.^2

^2 Here, the feasible policy is not required to be stationary.

III. Connection between Throughput Optimal BP Algorithm and Delay Optimal Control

In this section, we consider the case where $\boldsymbol{\lambda} \in \mathrm{int}(\Lambda)$ and study the connection between the throughput optimal BP algorithm and the delay optimal control.

A. Delay Optimal Control Problem

Under a given stationary policy $\omega$, the induced random process $\{(S(t), \mathbf{U}(t))\}$ is a controlled Markov chain with transition probabilities given by:

$P_{(s,\mathbf{u}),(s',\mathbf{u}')}(I, \boldsymbol{\nu}) = \Pr[(S(t+1), \mathbf{U}(t+1)) = (s', \mathbf{u}') \mid (S(t), \mathbf{U}(t)) = (s, \mathbf{u}), I, \boldsymbol{\nu}] = \Pr[S(t+1) = s']\, P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu})$ (8)

where $P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu}) = \Pr[\mathbf{U}(t+1) = \mathbf{u}' \mid (S(t), \mathbf{U}(t)) = (s, \mathbf{u}), I, \boldsymbol{\nu}]$. Note that $P_{(s,\mathbf{u}),(s',\mathbf{u}')}(I, \boldsymbol{\nu})$ denotes the probability that the next state will be $(s', \mathbf{u}') \in \mathcal{S} \times \mathcal{U}$ given that the current state is $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$ and the control action is $(I, \boldsymbol{\nu}) = \omega(s, \mathbf{u})$. In addition, $P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu}) = \sum_{s' \in \mathcal{S}} P_{(s,\mathbf{u}),(s',\mathbf{u}')}(I, \boldsymbol{\nu})$.

We now formulate the delay optimal control problem.

Problem 1 (Delay Optimal Control Problem):

$\min_{\omega} \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} E[U_n^{(c)}(\tau)]$ (9)

where $(S(0), \mathbf{U}(0)) \in \mathcal{S} \times \mathcal{U}$ and $\omega$ is a feasible stationary policy defined in Definition 1. Note that the expectation is taken w.r.t. the probability measure induced by the control policy $\omega$. Problem 1 is an infinite horizon average cost problem [6]. In the remainder of this section, we restrict our attention to stabilizing policies.

B. Delay Optimal Policy

In Sections III-B, III-C and III-D, for technical tractability, we assume $A_n^{(c)}(t)$ and $\nu_{ab}^{(c)}(t)$ both take on nonnegative rational values for all $t$, $\Delta$ is a nonnegative rational number, and $\mathcal{I}$ and $\mathcal{R}$ are finite. These assumptions ensure that $\mathcal{U}$ is countably infinite. Under certain conditions (specified below), a delay optimal policy $\omega^*$ can be obtained by solving the following Bellman equation.

Lemma 1 (Bellman Equation): Assume^3 that a scalar $d$ and a real-valued function $V(\cdot)$ solve the Bellman equation:

$d + V(\mathbf{u}) = \sum_{s \in \mathcal{S}} \Pr[S = s] \min_{I, \boldsymbol{\nu}} \Bigl\{ \sum_{n \in \mathcal{N}, c \in \mathcal{C}} u_n^{(c)} + \sum_{\mathbf{u}'} P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu}) V(\mathbf{u}') \Bigr\}$ (10)

for all $\mathbf{u} \in \mathcal{U}$ and furthermore $V(\cdot)$ satisfies:

$\lim_{t \to \infty} \frac{1}{t} E[V(\mathbf{U}(t)) \mid (S(0), \mathbf{U}(0)) = (s, \mathbf{u}), \omega] = 0$ (11)

for all $\omega$ and $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$, where $I$ and $\boldsymbol{\nu}$ in (10) satisfy $I \in \mathcal{I}$, (5), (6) and (7). Then, $d = \min_{\omega} \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} E[U_n^{(c)}(\tau)]$ is the optimal value to Problem 1 for all initial $(S(0), \mathbf{U}(0)) \in \mathcal{S} \times \mathcal{U}$ and $V(\cdot)$ is called the value function (potential function). Furthermore, if^4

$\omega^*(s, \mathbf{u}) = \arg\min_{I, \boldsymbol{\nu}} \sum_{\mathbf{u}'} P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu}) V(\mathbf{u}')$ (12)

for all $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$, where $I$ and $\boldsymbol{\nu}$ in (12) satisfy $I \in \mathcal{I}$, (5), (6) and (7), then $\omega^*$ is the delay optimal policy achieving the optimal value $d$.

Proof: Please refer to Appendix A for the proof. ■

^3 Assumption 4.6.2 and Assumption 4.6.3 in [6] provide the conditions for the existence of $d$ and $V(\cdot)$.

^4 Note that $V(\mathbf{u})$ captures the average delay cost starting from $\mathbf{u}$.

The fact that the optimal value $d$ does not change with $(S(0), \mathbf{U}(0))$ implies that the optimal policy $\omega^*$ is a unichain policy [6]. From Lemma 1, we can see that $\omega^*$ given by (12) depends on $\mathbf{u}$ through the value function $V(\cdot)$. Obtaining $V(\cdot)$ involves solving the Bellman equation in (10) for all $\mathbf{u} \in \mathcal{U}$, which does not admit a closed-form solution in general. Brute force numerical solutions such as value iteration and policy iteration are not practical for multi-hop queueing networks [1], [6] and do not yield many design insights.
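To make the Bellman equation (10) concrete, the sketch below runs relative value iteration on a toy instance: a single queue with a truncated buffer, Bernoulli arrivals, and an on/off channel. The truncation level `K`, the arrival and channel probabilities, and all function names are assumptions chosen for illustration. This is not the authors' procedure, and it does not scale to multi-hop networks, since the state space grows exponentially with the number of queues.

```python
def relative_value_iteration(p_arr, p_on, K=30, iters=2000):
    """Relative value iteration for a single truncated queue.

    State u in {0, ..., K}; one packet arrives w.p. p_arr per slot;
    the channel can serve one packet w.p. p_on. The per-slot cost is u.
    Returns (d, V): estimated average cost and relative value function.
    """
    arrivals = ((0, 1.0 - p_arr), (1, p_arr))
    channel = ((0, 1.0 - p_on), (1, p_on))
    V = [0.0] * (K + 1)
    d = 0.0
    for _ in range(iters):
        TV = []
        for u in range(K + 1):
            total = 0.0
            for s, ps in channel:
                # Inner minimization of (10) over feasible service nu <= min(s, u).
                best = min(
                    u + sum(pa * V[min(u - nu + a, K)] for a, pa in arrivals)
                    for nu in range(min(s, u) + 1)
                )
                total += ps * best
            TV.append(total)
        d = TV[0]                  # normalize so that V[0] = 0
        V = [x - d for x in TV]
    return d, V


d, V = relative_value_iteration(p_arr=0.3, p_on=0.6, K=20, iters=500)
```

Even in this one-dimensional case the table `V` has `K + 1` entries; with one such dimension per node-commodity pair, the table size becomes the product of all buffer ranges, which is the curse of dimensionality mentioned in the introduction.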
C. Asymptotically Delay Optimal Policy

In Sections III-C and III-D, for technical tractability, we suppose that for all slot durations $\Delta > 0$, a scalar $d$ and a real-valued function $V(\cdot)$ exist in Lemma 1, and $V(\cdot)$ is twice differentiable. Let $V'_{n,(c)}(\cdot) = \frac{\partial V}{\partial u_n^{(c)}}(\cdot)$.

It is difficult to directly study the features of delay optimal control policy for a general multi-hop network by investigating the properties of the value function $V(\cdot)$. We therefore study the asymptotic features of the delay optimal control. Specifically, using Taylor's theorem for $V(\cdot)$ [17], we can show that minimizing the expected value function in the delay optimal policy (cf. (12)) is the same as maximizing the gradient descent of the value function (cf. (13)), as $\Delta \to 0$ (please see Appendix B for the proof). Consider the policy

$\omega^{\dagger}(s, \mathbf{u}) = \arg\max_{I, \boldsymbol{\nu}} \sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \nu_{ab}^{(c)} \bigl( V'_{a,(c)}(\mathbf{u}) - V'_{b,(c)}(\mathbf{u}) \bigr)$ (13)

for all $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$, where $I$ and $\boldsymbol{\nu}$ in (13) satisfy $I \in \mathcal{I}$, (5), (6) and (7). Suppose that policy $\omega^{\dagger}$ is a unichain policy. Denote the average queue length under $\omega^{\dagger}$ by $d^{\dagger} = \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} E[U_n^{(c)}(\tau)]$, where the expectation is taken w.r.t. the probability measure induced by $\omega^{\dagger}$. Note that $\omega^{\dagger}$, $d^{\dagger}$, $\omega^*$ and $d$ are all functions of the slot duration $\Delta$, as $d$ and $V(\cdot)$ in Lemma 1 depend on $\Delta$ through the transition probabilities. Therefore, we also write $d^{\dagger}$ and $d$ as $d^{\dagger}(\Delta)$ and $d(\Delta)$, respectively, when studying the scaling behavior w.r.t. $\Delta$.

Next, we show that $\omega^{\dagger}$ is asymptotically delay optimal.

Lemma 2 (Asymptotically Delay Optimal Policy): For all slot durations $\Delta > 0$, suppose policy $\omega^{\dagger}$ is a unichain policy. Then, we have $d^{\dagger}(\Delta) - d(\Delta) = o(\Delta)$ as $\Delta \to 0$.

Proof: Please refer to Appendix B for the proof. ■

Note that by Lemma 2, we have $d^{\dagger}(\Delta) - d(\Delta) \to 0$ as $\Delta \to 0$. This shows that policy $\omega^{\dagger}$ is asymptotically delay optimal when the scheduling slot duration is small. It can be easily shown that policy $\omega^{\dagger}$ chooses resource allocation and routing action according to the following corollary.

Corollary 1 (Asymptotically Delay Optimal Policy): Given the observed $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$, let $I^{\dagger}$ and $\boldsymbol{\nu}^{\dagger}$ be defined as in (15) and (16). If $\boldsymbol{\nu}^{\dagger}$ satisfies (7), then $\omega^{\dagger}$ chooses the resource allocation and routing action $\omega^{\dagger}(s, \mathbf{u}) = (I^{\dagger}, \boldsymbol{\nu}^{\dagger})$.

Resource Allocation: For each link $(a, b) \in \mathcal{L}$ and commodity $c \in \mathcal{C}$, let

$\delta V_{ab}^{(c)}(\mathbf{u}) = V'_{a,(c)}(\mathbf{u}) - V'_{b,(c)}(\mathbf{u})$ (14)

denote the asymptotically delay optimal backpressure of link $(a, b)$ w.r.t. commodity $c$. Let $c^{\dagger}_{ab}(\mathbf{u}) = \arg\max_{c \in \{c : (a,b) \in \mathcal{L}^{(c)}\}} \delta V_{ab}^{(c)}(\mathbf{u})$ and $\delta V^{\dagger}_{ab}(\mathbf{u}) = \bigl( \delta V_{ab}^{(c^{\dagger}_{ab}(\mathbf{u}))}(\mathbf{u}) \bigr)^+$, where $(x)^+ = \max\{x, 0\}$. Choose the resource allocation action as:

$I^{\dagger} = \arg\max_{I \in \mathcal{I}} \sum_{(a,b) \in \mathcal{L}} \delta V^{\dagger}_{ab}(\mathbf{u}) R_{ab}(s, I).$ (15)

Routing: For each link $(a, b) \in \mathcal{L}$ and commodity $c \in \mathcal{C}$, choose the routing action according to:

$\nu_{ab}^{(c)\dagger} = \begin{cases} R_{ab}(s, I^{\dagger}), & \delta V^{\dagger}_{ab}(\mathbf{u}) > 0 \text{ and } c = c^{\dagger}_{ab}(\mathbf{u}) \\ 0, & \text{otherwise.} \end{cases}$ (16)
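The backpressure in (14) is a difference of partial derivatives of the value function, so it can react to queues that are not at either end of the link. The sketch below illustrates this with a central-difference gradient of a hypothetical surrogate `V_example`; the true value function is not available in closed form, so the surrogate, the step size `h`, and the function names are all assumptions made for illustration.

```python
def grad(V, u, h=1e-6):
    """Central-difference estimate of the partial derivatives of V at u."""
    g = []
    for i in range(len(u)):
        up = list(u)
        dn = list(u)
        up[i] += h
        dn[i] -= h
        g.append((V(up) - V(dn)) / (2.0 * h))
    return g


def delta_v(V, u, links):
    """Backpressure (14) on each link (a, b) for a single commodity."""
    g = grad(V, u)
    return {(a, b): g[a] - g[b] for (a, b) in links}


def V_example(u):
    """Hypothetical surrogate: quadratic cost plus a coupling between nodes 1 and 2."""
    return 0.5 * sum(x * x for x in u) + u[1] * u[2]


bp = delta_v(V_example, [5.0, 1.0, 2.0], [(0, 1), (1, 2)])
```

For this surrogate the partial derivatives at u = (5, 1, 2) are (5, 3, 3), so the backpressure on link (0, 1) is about 2 and on link (1, 2) about 0. The one-hop queue differences for the same state are 4 and -1. The backpressure on link (0, 1) is smaller than the queue difference because the coupling term makes the derivative at node 1 depend on the backlog at node 2.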


D. Connection to the BP Algorithm 

We now discuss the connection between the asymptotically delay optimal policy and the throughput optimal BP algorithm. By Corollary 1, we can see that the asymptotically delay optimal control, obtained using dynamic programming, shares striking similarities with the BP algorithm [7], [8], obtained using Lyapunov drift techniques. Specifically, the two algorithms both base resource allocation and routing on a backpressure calculation. On the other hand, the two algorithms differ in the form of the backpressure calculation employed. In the BP algorithm, the backpressure of link $(a, b)$ w.r.t. commodity $c$ is derived from the difference between the queue lengths of commodity $c$ at the two end nodes $a$ and $b$ of the link, i.e., $u_a^{(c)} - u_b^{(c)}$. In the asymptotically delay optimal control algorithm, the backpressure of link $(a, b)$ w.r.t. commodity $c$ is derived from the differences of the derivatives of the value function at the two end nodes $a$ and $b$ of the link, i.e., $V'_{a,(c)}(\mathbf{u}) - V'_{b,(c)}(\mathbf{u})$.

The following lemma summarizes the property of the value function. First, we introduce the BP control $\omega^{\ddagger}$ as follows:

$\omega^{\ddagger}(s, \mathbf{u}) = \arg\max_{I, \boldsymbol{\nu}} \sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \nu_{ab}^{(c)} \bigl( u_a^{(c)} - u_b^{(c)} \bigr)$ (17)

for all $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$, where $I$ and $\boldsymbol{\nu}$ in (17) satisfy $I \in \mathcal{I}$, (5), (6) and (7). We know that the traditional BP algorithm [7], [8] is closely related to $\omega^{\ddagger}$.^5

Lemma 3 (Comparison between $\omega^{\dagger}$ and $\omega^{\ddagger}$): For some $n \in \mathcal{N}$ and $c \in \mathcal{C}$, there does not exist a function $g_n^{(c)}(u_n^{(c)})$, such that $V'_{n,(c)}(\mathbf{u}) = g_n^{(c)}(u_n^{(c)})$. Furthermore, there exists $(s, \mathbf{u}) \in \mathcal{S} \times \mathcal{U}$ such that $\omega^{\dagger}(s, \mathbf{u}) \ne \omega^{\ddagger}(s, \mathbf{u})$.

Proof: Please refer to Appendix C for the proof. ■

^5 The BP algorithm does not consider the constraint in (7), as (7) is automatically satisfied when queue lengths are large and does not matter when dealing with throughput optimality.

Lemma 3 shows that the backpressure term for the asymptotically delay optimal control is in general a function of the global QSI. On the other hand, the BP backpressure term reflects local QSI. Therefore, the asymptotically delay optimal control and the traditional BP algorithm are significantly different.

Suppose that policy $\omega^{\ddagger}$ is a unichain policy. Denote the average queue length under $\omega^{\ddagger}$ by $d^{\ddagger} = \limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} E[U_n^{(c)}(\tau)]$, where the expectation is taken w.r.t. the probability measure induced by $\omega^{\ddagger}$. We also write $d^{\ddagger}$ as $d^{\ddagger}(\Delta)$.

Lemma 4 (Comparison between $d^{\ddagger}$ and $d^{\dagger}$): For all slot durations $\Delta > 0$, suppose policy $\omega^{\dagger}$ and policy $\omega^{\ddagger}$ are unichain policies. Then, there exists $\epsilon > 0$ such that $d^{\ddagger}(\Delta) - d^{\dagger}(\Delta) \ge \epsilon + o(\Delta)$ as $\Delta \to 0$.

Proof: Please refer to Appendix D for the proof. ■

Lemma 4 shows that the BP algorithm has worse delay performance than the asymptotically delay optimal control $\omega^{\dagger}$. By Lemma 2 and Lemma 4, we have $d^{\ddagger}(\Delta) - d(\Delta) \ge \epsilon + o(\Delta)$. Thus, $d^{\ddagger}(\Delta) - d(\Delta) \nrightarrow 0$ as $\Delta \to 0$. Therefore, the BP algorithm is not asymptotically delay optimal.
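The step from Lemma 2 and Lemma 4 to this conclusion can be written out explicitly. Writing $d^{\ddagger}$ for the average queue length under the BP control, $d^{\dagger}$ for that under the asymptotically delay optimal policy, and $d$ for the optimal value, the gap splits into two terms that the two lemmas bound separately:

```latex
\begin{align*}
d^{\ddagger}(\Delta) - d(\Delta)
  &= \bigl(d^{\ddagger}(\Delta) - d^{\dagger}(\Delta)\bigr)
   + \bigl(d^{\dagger}(\Delta) - d(\Delta)\bigr) \\
  &\ge \bigl(\epsilon + o(\Delta)\bigr) + o(\Delta)
   \;=\; \epsilon + o(\Delta), \qquad \Delta \to 0 .
\end{align*}
```

Since $\epsilon > 0$ does not depend on $\Delta$, the gap stays bounded below by a positive constant in the limit, so it cannot vanish.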

Lemma 3 and Lemma 4 suggest a possible reason for the 
generally unsatisfactory delay performance of the BP algorithm. 
Namely, the BP delay performance suffers from the myopic 
nature of the control, which relies only on one-hop queue size 
differences. To the best of our knowledge, this is the first work 
which provides an analytical connection between delay optimal 
network control and the BP algorithm (throughput optimal 
network control). The connection provides a theoretical basis 
for designing enhanced BP algorithms with improved delay 
performance via the use of QSI beyond one hop. Motivated by 
this connection, we shall present a new class of enhanced BP- 
based algorithms, for the cases where $\boldsymbol{\lambda} \in \mathrm{int}(\Lambda)$ and $\boldsymbol{\lambda} \notin \Lambda$ 
in Sections IV and VI, respectively. 

IV. Enhanced BP Algorithms with Stabilizable Arrival Rates 

In this section, we consider the case where $\boldsymbol{\lambda} \in \mathrm{int}(\Lambda)$, and develop a new class of enhanced BP algorithms.

A. Network Layer Queue Dynamics

When $\boldsymbol{\lambda} \in \mathrm{int}(\Lambda)$, the arrival data can be directly admitted to the network. Let $\mu_{ab}^{(c)}(t)\Delta \ge 0$ represent the amount of commodity $c$ data (bits) which can be transmitted over link $(a, b)$ during slot $t$. Let $\boldsymbol{\mu}(t) = (\mu_{ab}^{(c)}(t))$. Similar to $\boldsymbol{\nu}(t)$, $\boldsymbol{\mu}(t)$ also satisfies (1) and (2) (in terms of $\mu_{ab}^{(c)}(t)$ instead of $\nu_{ab}^{(c)}(t)$). Unlike $\boldsymbol{\nu}(t)$, $\boldsymbol{\mu}(t)$ does not have to satisfy (4) (in terms of $\mu_{ab}^{(c)}(t)$ instead of $\nu_{ab}^{(c)}(t)$). In other words, for each node $n \in \mathcal{N}$ and commodity $c \in \mathcal{C}$, $\mu_{nb}^{(c)}(t) = \nu_{nb}^{(c)}(t)$ for all $b \in \mathcal{N}$ when there are enough bits to be removed from the buffer at node $n$ for commodity $c$, i.e., (4) is satisfied (in terms of $\mu_{nb}^{(c)}(t)$ instead of $\nu_{nb}^{(c)}(t)$). Thus, for each commodity $c \in \mathcal{C}$ and node $n \in \mathcal{N}$, $n \ne \mathrm{dest}(c)$, the network queue dynamics satisfies:

$U_n^{(c)}(t+1) \le \Bigl( U_n^{(c)}(t) - \sum_{b \in \mathcal{N}} \mu_{nb}^{(c)}(t)\Delta \Bigr)^+ + A_n^{(c)}(t)\Delta + \sum_{a \in \mathcal{N}} \mu_{an}^{(c)}(t)\Delta.$ (18)

Inequality holds in (18) because the actual amount of commodity $c$ data arriving to node $n$ during slot $t$ may be less than $\sum_{a \in \mathcal{N}} \mu_{an}^{(c)}(t)\Delta$ if the neighboring nodes have little or no commodity $c$ data to transmit. To facilitate the design of throughput optimal control, we consider routing control in terms of $\boldsymbol{\mu}(t)$ instead of $\boldsymbol{\nu}(t)$, as in [7], [8].
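The right-hand side of (18) is an upper bound that is simple to evaluate at one node. The helper below is a minimal sketch for one node and one commodity; the function name and argument layout are assumptions made here.

```python
def backlog_upper_bound(U_n, mu_out, mu_in, arrival, delta=1.0):
    """Right-hand side of (18) for one node and one commodity.

    U_n     : current backlog (bits)
    mu_out  : offered rates (bit/sec) on outgoing links
    mu_in   : offered rates (bit/sec) on incoming links
    arrival : exogenous arrival rate (bit/sec)
    """
    drained = max(U_n - sum(mu_out) * delta, 0.0)   # the (x)^+ operation
    return drained + arrival * delta + sum(mu_in) * delta


bound = backlog_upper_bound(U_n=1.0, mu_out=[3.0], mu_in=[2.0], arrival=0.5)
```

Here the node is offered 3 bits of outgoing service but holds only 1, so the drained term is clipped at zero rather than going negative. The bound also counts the full 2 bits of offered incoming service even though upstream neighbors may have less to send, which is why (18) is an inequality.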

B. Bias Function 

Motivated by the connection between the asymptotically 
delay optimal control and the throughput optimal BP algorithm 
in Section III-D, we now propose a general QSI-dependent bias 
function to incorporate QSI beyond one hop in order to mitigate 
the myopic nature of the BP algorithm. 

We first present a general nonnegative QSI-dependent bias function for each node $n \in \mathcal{N}$ and commodity $c \in \mathcal{C}$:

$f_n^{(c)}(\mathbf{u}) = \sum_{k \in \mathcal{N}} \eta_{nk}^{(c)}(\mathbf{u}) \frac{u_k^{(c)}}{z_k^{(c)}}.$ (19)

Here, $\eta_{nk}^{(c)}(\mathbf{u}) \in [0, 1]$ is the weight associated with QSI $u_k^{(c)}$ at node $n$, representing the relative importance of $u_k^{(c)}$ in the bias at node $n$. Note that in general, $\eta_{nk}^{(c)}(\mathbf{u})$ is allowed to depend on the global QSI $\mathbf{u}$. The parameter $z_k^{(c)} > 0$ is designed to guarantee network stability. The proper choice of $z_k^{(c)} > 0$ will be discussed below in Theorem 1. We can treat $\frac{u_k^{(c)}}{z_k^{(c)}}$ as a normalized version of $u_k^{(c)}$. Later, we shall see that the quantity $u_n^{(c)} + f_n^{(c)}(\mathbf{u})$ can be regarded as a tractable approximation of $V'_{n,(c)}(\mathbf{u})$ (cf. (21) and (14)) in the asymptotically delay optimal policy in (13).

While the bias function $f_n^{(c)}(\mathbf{u})$ in (19) is generally written as a function of the global QSI, one can choose the bias function to depend only on the local QSI within one hop as follows:

$f_n^{(c)}(\mathbf{u}_n^{(c)}) = \sum_{k \in \{k : (n,k) \in \mathcal{L}^{(c)}\}} \eta_{nk}^{(c)}(\mathbf{u}_n^{(c)}) \frac{u_k^{(c)}}{z_k^{(c)}}$ (20)

where $\mathbf{u}_n^{(c)} = (u_k^{(c)})_{(n,k) \in \mathcal{L}^{(c)}}$ is the local QSI within one hop.
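A bias of the form (19) is a weighted sum of normalized backlogs. The sketch below evaluates it for one commodity, taking the weights as a matrix already evaluated at the current QSI; the function name `bias`, the matrix layout `eta[n][k]`, and the example values are assumptions introduced for illustration.

```python
def bias(u, eta, z):
    """Bias (19) for one commodity: f[n] = sum_k eta[n][k] * u[k] / z[k].

    u[k]      : backlog at node k
    eta[n][k] : weight in [0, 1] that node n places on node k's backlog
    z[k]      : strictly positive normalization parameter
    """
    n_nodes = len(u)
    if any(w < 0.0 or w > 1.0 for row in eta for w in row):
        raise ValueError("weights must lie in [0, 1]")
    if any(zk <= 0.0 for zk in z):
        raise ValueError("z must be strictly positive")
    return [
        sum(eta[n][k] * u[k] / z[k] for k in range(n_nodes))
        for n in range(n_nodes)
    ]


# Hypothetical line network 0 -> 1 -> 2: each node looks only at its next hop,
# which is a special case of the one-hop form (20).
eta = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
f = bias([4.0, 2.0, 0.0], eta, z=[2.0, 2.0, 2.0])
```

Setting a row of `eta` to zero outside a node's out-neighbors restricts the bias to local QSI, so the same routine covers both (19) and (20). Larger values of `z` shrink the bias toward zero, in which case the backpressure approaches that of the traditional BP algorithm.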

Each specific choice of a bias function $\mathbf{f} = (f_n^{(c)})$ corresponds to one enhanced BP algorithm (described in the next subsection), and the amount of QSI contributing to the bias function determines the implementation complexity of the corresponding enhanced BP algorithm. The form of the bias function (cf. (19) and (20)) is carefully chosen to enable (generalized) throughput optimality (Theorem 1) of the enhanced BP algorithms, while at the same time offering a high degree of flexibility in choosing specific enhanced BP algorithms with manageable complexity, distributed implementation, and significantly better delay performance. In Section V, we shall describe two enhanced BP algorithms resulting from two specific choices for $\mathbf{f}$ in (20) and (19), respectively, which embody the desirable properties described above.

C. Enhanced BP Algorithm 

We now present a new class of enhanced BP algorithms 
by incorporating the general QSI-dependent bias functions f 
defined in (19) into the BP backpressure calculation. 

Algorithm 1 (Enhanced BP): Let a set of bias functions $\mathbf{f} \succeq \mathbf{0}$ be given.^6 At each slot $t$, the network controller observes the network layer QSI $\mathbf{U}(t)$ and the topology state $S(t)$, and performs the following resource allocation and routing actions.

Resource Allocation: For each link $(a, b) \in \mathcal{L}$ and commodity $c \in \mathcal{C}$, let^7

$W_{ab}^{(c)}(\mathbf{U}(t)) = \bigl( U_a^{(c)}(t) + f_a^{(c)}(\mathbf{U}(t)) \bigr) - \bigl( U_b^{(c)}(t) + f_b^{(c)}(\mathbf{U}(t)) \bigr)$ (21)

denote the enhanced BP backpressure of link $(a, b)$ w.r.t. commodity $c$. Let $c^*_{ab}(\mathbf{U}(t)) = \arg\max_{c \in \{c : (a,b) \in \mathcal{L}^{(c)}\}} W_{ab}^{(c)}(\mathbf{U}(t))$ and $W^*_{ab}(\mathbf{U}(t)) = \bigl( W_{ab}^{(c^*_{ab}(\mathbf{U}(t)))}(\mathbf{U}(t)) \bigr)^+$. Choose the resource allocation action as follows:

$I(t) = \arg\max_{I \in \mathcal{I}} \sum_{(a,b) \in \mathcal{L}} W^*_{ab}(\mathbf{U}(t)) R_{ab}(S(t), I).$ (22)

Routing: For each link $(a, b) \in \mathcal{L}$ and commodity $c \in \mathcal{C}$, choose:

$\mu_{ab}^{(c)}(t) = \begin{cases} R_{ab}(S(t), I(t)), & W^*_{ab}(\mathbf{U}(t)) > 0 \text{ and } c = c^*_{ab}(\mathbf{U}(t)) \\ 0, & \text{otherwise.} \end{cases}$

^6 The notation $\succeq$, $\preceq$, $\succ$, $\prec$ indicate component-wise $\ge$, $\le$, $>$, $<$.

^7 A QSI-independent bias, such as the shortest-path bias used in [9], can easily be incorporated into bias functions. It can be verified that, with any extra QSI-independent bias, Theorem 1 (Theorem 2) still holds with a constant shift of $B$ in (24) ($B$ in (40)) [9].

Note that based on $\boldsymbol{\mu}(t)$, we can choose routing actions $\boldsymbol{\nu}(t)$. For each node $a \in \mathcal{N}$ and commodity $c \in \mathcal{C}$, if $\sum_{b \in \mathcal{N}_a^{(c)}} \mu_{ab}^{(c)}(t)\Delta \le U_a^{(c)}(t)$, we choose routing action $\nu_{ab}^{(c)}(t) = \mu_{ab}^{(c)}(t)$ for all $b \in \mathcal{N}_a^{(c)}$, where $\mathcal{N}_a^{(c)} = \{b : (a, b) \in \mathcal{L}^{(c)}\}$. When $\sum_{b \in \mathcal{N}_a^{(c)}} \mu_{ab}^{(c)}(t)\Delta > U_a^{(c)}(t)$, there are multiple choices for $\boldsymbol{\nu}(t)$, which can guarantee generalized throughput optimality shown in Theorem 1.


D. Performance Analysis 

As mentioned above, the traditional BP algorithm can have 
poor delay performance in networks with light to moderate 
traffic loads. Note that in this situation, there exists a significant 
margin between the arrival rate vector and the boundary of the 
network stability region. Algorithm 1, with the bias functions 
chosen according to (19), can exploit this margin (or a lower 
bound on the margin) to incorporate QSI beyond one hop, 
thereby reducing the average delay while maintaining a generalized
notion of throughput optimality. This result is summarized
as follows. For notational simplicity, we assume $\Delta = 1$.

Theorem 1 (Generalized Throughput Optimality of Alg. 1): 
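Given $\boldsymbol{\epsilon} = (\epsilon_n^{(c)}) \succeq \mathbf{0}$ such that $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$, there exist
$\boldsymbol{\delta} = (\delta_n^{(c)}) \succ \mathbf{0}$ such that $\boldsymbol{\lambda} + \boldsymbol{\epsilon} + \boldsymbol{\delta} \in \Lambda$, and $\mathbf{z} = (z_n^{(c)}) \succ \mathbf{0}$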
such that e z = ^ 2Rm “ ) [( ) ^ A e. Then, the queueing network 
under Algorithm 1, with the bias functions chosen according 
to (19), satisfies: 

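$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} \mathbb{E}\left[U_n^{(c)}(\tau)\right] \le \frac{NB}{\beta_z} \qquad (23)$$

where

$$B \triangleq \frac{1}{2N} \sum_{n \in \mathcal{N}} \left( (\mu_{n,\max}^{out})^2 + (A_{n,\max} + \mu_{n,\max}^{in})^2 \right) \qquad (24)$$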

/3 Z = sup min 

{(e,<5):e^e z ,5^0, neJV,c€C 
A} 

(25) 

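with $\mu_{n,\max}^{out} \triangleq \sup_{s \in \mathcal{S}, I \in \mathcal{I}} \sum_{b \in \mathcal{N}} R_{nb}(s, I)$, $\mu_{n,\max}^{in} \triangleq \sup_{s \in \mathcal{S}, I \in \mathcal{I}} \sum_{a \in \mathcal{N}} R_{an}(s, I)$, and $A_{n,\max} \triangleq \sum_{c \in \mathcal{C}} A_{n,\max}^{(c)}$.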

Proof: Please refer to Appendix E. ■ 

Remark 1: Theorem 1 should be interpreted as follows.
When it is given that the arrival rate vector $\boldsymbol{\lambda}$ is bounded away
from the boundary of the stability region $\Lambda$ by at least $\boldsymbol{\epsilon} \succ \mathbf{0}$,
i.e., $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$, one can choose a finite $\mathbf{z} \succ \mathbf{0}$ such that
$\boldsymbol{\epsilon}_z \preceq \boldsymbol{\epsilon}$. In this case, the limiting average total queue size under
Algorithm 1 is upper bounded as in (23). Thus, Algorithm 1
stabilizes the network for any arrival rate which is bounded
away from the boundary of the stability region by at least $\boldsymbol{\epsilon}$,
for any given $\boldsymbol{\epsilon} \succ \mathbf{0}$. When it is only known that $\boldsymbol{\lambda} \in \mathrm{int}(\Lambda)$
and no extra margin is given ($\boldsymbol{\epsilon} = \mathbf{0}$), then by Theorem 1, $z_n^{(c)}$
must be chosen to be infinity for all $n \in \mathcal{N}$ and $c \in \mathcal{C}$ (i.e.,
$\mathbf{f} = \mathbf{0}$). In this case, Algorithm 1 reduces to the traditional BP
algorithm, and Theorem 1 reduces to the traditional throughput
optimality result (Lemma 4.1 in [8]).

The margin (or a lower bound on the margin) $\boldsymbol{\epsilon}$ in Theorem 1
may be obtained in several possible ways. First, traffic measurement
is usually performed for practical networks (e.g., at
different times of a day). The difference between the measured
peak traffic load during the busy traffic hour (assuming the
network remains stable) and the load during an off-peak non-busy
traffic hour can serve as a lower bound on the margin
during the non-busy traffic hour. Second, in the case where
the arrival rate vector, or an upper bound on the arrival rate
vector, can be estimated, one can calculate a lower bound on
the margin by solving a linear program (in a distributed manner)
maximizing $\min_{n \in \mathcal{N}, c \in \mathcal{C}} \epsilon_n^{(c)}$ subject to $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$, where
$\Lambda$ is characterized in [8, pp. 32].

Note that the choice of $\mathbf{z}$ based on $\boldsymbol{\epsilon}$ as given by Theorem 1
for throughput optimality may not be optimal from
the viewpoint of improving delay performance. In Section VII, 
we shall show using numerical simulations that the proposed 
enhanced BP algorithms with appropriately chosen parameters 
z achieve significant delay improvement over the existing BP- 
based algorithms under small or moderate loading. 

V. Two Bias Functions 

In this section, we describe two enhanced BP algorithms 
resulting from two specific choices for f. These enhanced BP 
algorithms are designed to ameliorate the myopia of the traditional
BP algorithm, thereby improving its delay performance.
In addition, the enhanced BP algorithms can be implemented 
in a distributed manner with manageable complexity. 


Algorithm            Computational Complexity   Signaling Overhead
BP (BPbias)          $O(N^2 C)$                 $O(N^2 C)$
BPnxt (BPnxtbias)    $O(N^2 C)$                 $O(N^2 C)$
BPmin (BPminbias)    $O(N^3 C)$                 $O(N^3 C)$
BPSP                 $O(N^4 C)$                 $O(N^3 C)$

TABLE I: Comparisons on algorithm complexity.


A. Minimum Next-hop Queue Length Bias (BPnxt) 

We consider a local QSI-dependent bias function which is 
an example of (20) and allows the resulting enhanced BP 
algorithm to incorporate QSI one more hop beyond what is 
accounted for in the traditional BP algorithm. Specifically, 
consider the minimum next-hop queue length bias function 
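defined as follows. For each node $n \in \mathcal{N}$ and commodity $c \in \mathcal{C}$,
let $H_n^{*(c)}(\mathbf{u}^{(c)}) \triangleq \min_{k \in \{k : (n,k) \in \mathcal{L}^{(c)}\}} u_k^{(c)}$ be the minimum
next-hop queue length, and choose $\eta_{nk}^{(c)}(\mathbf{u}^{(c)})$ as follows:

$$\eta_{nk}^{(c)}(\mathbf{u}^{(c)}) = \begin{cases} \dfrac{1}{N_{nxt,n}^{*(c)}}, & k \in \mathcal{N}_{nxt,n}^{*(c)} \\ 0, & \text{otherwise} \end{cases} \qquad (26)$$

where $\mathcal{N}_{nxt,n}^{*(c)} \triangleq \{k : u_k^{(c)} = H_n^{*(c)}(\mathbf{u}^{(c)}), (n,k) \in \mathcal{L}^{(c)}\}$ and
$N_{nxt,n}^{*(c)} = |\mathcal{N}_{nxt,n}^{*(c)}|$. For any given margin $\boldsymbol{\epsilon} = (\epsilon_n^{(c)}) \succ \mathbf{0}$, we
choose $\mathbf{z} = (z_n^{(c)})$, $z_n^{(c)} = z$ for all $n \in \mathcal{N}$ and $c \in \mathcal{C}$, where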


- (c) • y ’ 

mm ngA r, ce c £n 

Here, $d_{in}$ denotes the largest node in-degree among all nodes
in the graph. Thus, the minimum next-hop queue length bias
function is given by:

$$f_n^{(c)}(\mathbf{u}^{(c)}) = \frac{1}{z} H_n^{*(c)}(\mathbf{u}^{(c)}). \qquad (28)$$

Given the bias function in (28), and using Algorithm 1, we 
now obtain an enhanced BP algorithm, which will be referred 
to as BPnxt. We show in Appendix F that for all $z$ satisfying
(27), BPnxt stabilizes the network for any $\boldsymbol{\lambda}$ satisfying $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$.
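As a rough illustration of the minimum next-hop queue length bias, the sketch below computes the tie-splitting weights and the resulting bias for one node and commodity. The names `next_hop_weights`, `next_hop_bias`, and the `next_hops` lookup table are hypothetical, and returning a zero bias for a node without next hops (such as a destination) is an assumption made here for convenience.

```python
from typing import Dict, List, Tuple

Queues = Dict[int, Dict[str, float]]
NextHops = Dict[Tuple[int, str], List[int]]


def next_hop_weights(n: int, c: str, queues: Queues, next_hops: NextHops) -> Dict[int, float]:
    """Weights over next hops: equal shares on the minimizers, zero elsewhere."""
    hops = next_hops.get((n, c), [])
    if not hops:
        return {}
    h_star = min(queues[k][c] for k in hops)
    minimizers = [k for k in hops if queues[k][c] == h_star]
    return {k: (1.0 / len(minimizers) if k in minimizers else 0.0) for k in hops}


def next_hop_bias(n: int, c: str, queues: Queues, next_hops: NextHops, z: float) -> float:
    """Minimum next-hop queue length divided by z; zero if n has no next hop (assumption)."""
    hops = next_hops.get((n, c), [])
    if not hops:
        return 0.0
    return min(queues[k][c] for k in hops) / z
```

Because the weights sum to one and are supported on the minimizers, the weighted sum of next-hop queue lengths equals the minimum itself, so the bias shrinks toward zero as `z` grows and the rule approaches traditional BP.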

We now briefly discuss the implementation complexity of the 
BPnxt algorithm, as illustrated in Table I. Since the difference 
between the traditional BP algorithm and the BPnxt algorithm 
lies in the backpressure calculation, we focus only on the 
complexity for implementing (21). Consider the computational 
complexity first. For each commodity, each node needs to 
compute the minimum next-hop queue length bias, which 
involves a minimization over no more than N queue lengths, 
and the summation of its queue length and the minimum 
next-hop queue length bias. In addition, for each commodity, 
each node needs to carry out one subtraction to compute the 
enhanced BP backpressure of each outgoing link, involving 
in total no more than N operations for no more than N 
outgoing links. Thus, the total computational complexity for 
the backpressure calculation of BPnxt (over $N$ nodes and $C$
commodities) is $O(N^2 C)$. It is easy to see that this is the
same in order as the computational complexity for calculating 
the backpressure in the traditional BP algorithm and the BPbias 
algorithm in [9]. Next, consider the signaling overhead for the 
backpressure calculation of BPnxt. During the signaling phase 
of each slot, for each commodity, each node needs to report its 
queue length and the sum of its queue length and its minimum 
next-hop queue length bias (which can be obtained from the 
information reported to the node from its next-hop neighbors) 
to no more than $N$ previous-hop neighbors. Thus, the total
signaling overhead is $O(N^2 C)$. Again, it is easy to see that this
is the same in order as the signaling overhead for calculating 
the backpressure in the traditional BP algorithm and the BPbias 
algorithm. In summary, the BPnxt algorithm has the same order 
of implementation complexity as the traditional BP algorithm 
and the BPbias algorithm. 

B. Minimum Downstream Sum Queue Length Bias (BPmin) 

Next, we consider a global QSI-dependent bias function 
which is an example of (19) and incorporates multi-hop QSI 
downstream toward the destinations of the respective traffic 
commodities. Consider the minimum downstream sum queue
length bias function defined as follows. For each node $n \in \mathcal{N}$
and commodity $c \in \mathcal{C}$, let $M_n^{(c)}$ denote the number of paths
from node $n$ to the destination node of commodity $c$, $\mathrm{dest}(c)$.
Let $\mathcal{P}_{n,m}^{(c)}$ denote the set of nodes on the $m$-th path from node
$n$ to node $\mathrm{dest}(c)$, where $m = 1, \dots, M_n^{(c)}$. The sum queue
length excluding node $n$ on path $\mathcal{P}_{n,m}^{(c)}$ is given by
$T_{n,m}^{(c)}(\mathbf{u}) = \sum_{i \in \mathcal{P}_{n,m}^{(c)}, i \neq n} u_i^{(c)}$. Let
$T_n^{*(c)}(\mathbf{u}) \triangleq \min_{m \in \{1, \dots, M_n^{(c)}\}} T_{n,m}^{(c)}(\mathbf{u})$
be the sum queue length excluding node $n$ on the shortest path
(in terms of the sum queue length) to $\mathrm{dest}(c)$. For each node
$n \in \mathcal{N}$ and commodity $c \in \mathcal{C}$, choose $\eta_{nk}^{(c)}(\mathbf{u})$ as follows:


II 

si 

J AT* (C) : 

< mm, n 

h £ V* V* £ Af*^ 

^ ^ * /v min,n 

(29) 

l 0 ’ 

otherwise 


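where

$$\mathcal{N}_{\min,n}^{*(c)} \triangleq \left\{ \mathcal{P}_{n,m}^{(c)} : T_{n,m}^{(c)}(\mathbf{u}) = T_n^{*(c)}(\mathbf{u}), \; m \in \{1, \dots, M_n^{(c)}\} \right\}$$

is the set of shortest paths (in terms of the sum queue length)
from $n$ to $\mathrm{dest}(c)$ and $N_{\min,n}^{*(c)} = |\mathcal{N}_{\min,n}^{*(c)}|$. As in BPnxt, for any
given margin $\boldsymbol{\epsilon} = (\epsilon_n^{(c)}) \succ \mathbf{0}$, we choose $\mathbf{z} = (z_n^{(c)})$, $z_n^{(c)} = z$
for all $n \in \mathcal{N}$ and $c \in \mathcal{C}$, where $z$ satisfies (27). Thus, the
minimum downstream sum queue length bias function is given
by:

$$f_n^{(c)}(\mathbf{u}) = \frac{1}{z} T_n^{*(c)}(\mathbf{u}). \qquad (30)$$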

Given the bias function in (30), and using Algorithm 1, we 
now obtain an enhanced BP algorithm, which will be referred 
to as BPmin. We show in Appendix F that for all $z$ satisfying
(27), BPmin stabilizes the network for any $\boldsymbol{\lambda}$ satisfying $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$.

We now discuss the implementation complexity of the BPmin 
algorithm, as illustrated in Table I. Again, we focus only on 
the complexity for implementing the backpressure calculation 
given in (21). Consider the computational complexity first. The 
downstream sum queue length minimization is a shortest path 
problem where the per-link cost is the instantaneous queue 
length at the receive node of the link. Thus, the minimization 
can be solved in a distributed and parallel manner using the 
iterative Bellman-Ford algorithm [18]. In each iteration, for 
each commodity, each node needs to compute no more than $N$
summations and one minimization over no more than $N$ alternatives.
The number of iterations is no more than $N$. Thus, for
each node and each commodity, the computational complexity
of calculating the minimum downstream sum queue length bias
is $O(N^2)$. In addition, for each commodity, each node needs to
carry out the summation of its queue length and the minimum 
downstream sum queue length bias, and then carry out one 
subtraction to compute the enhanced BP backpressure of each 


outgoing link, involving in total no more than N operations for 
no more than N outgoing links. Thus, the total computational 
complexity for the backpressure calculation of BPmin (over N 
nodes and $C$ commodities) is $O(N^3 C)$, which is lower than
that of the BPSP algorithm in [10] ($O(N^4 C)$), but higher than
that of the traditional BP algorithm and the BPbias algorithm
($O(N^2 C)$). Next, consider the signaling overhead. For each
commodity and each iteration of the Bellman-Ford algorithm, 
each node needs to report one intermediate shortest path length 
to no more than N previous-hop neighbors. When the Bellman- 
Ford algorithm converges within no more than N iterations, 
each node needs to report the sum of its queue length and 
the minimum downstream sum queue length bias to no more 
than N previous-hop neighbors, for each commodity. Thus, 
the total signaling overhead is $O(N^3 C)$, which is the same in
order as that of the BPSP algorithm, but higher in order than
that of the traditional BP algorithm and the BPbias algorithm
($O(N^2 C)$). In summary, the BPmin algorithm has a higher
order of implementation complexity than the traditional BP 
algorithm, the BPbias algorithm and the BPnxt algorithm, but 
a lower order of implementation complexity than the BPSP 
algorithm. 
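The shortest-path computation described above can be sketched as a centralized Bellman-Ford pass for a single commodity, with the per-link cost taken to be the queue length at the receiving node. This is only an illustrative sketch: the function name `min_downstream_sum`, the adjacency-list layout, and the centralized (rather than distributed) execution are assumptions made here, not details given in the text.

```python
import math
from typing import Dict, List


def min_downstream_sum(
    queue_c: Dict[int, float],
    out_neighbors: Dict[int, List[int]],
    dest: int,
) -> Dict[int, float]:
    """Return, for every node n, the minimum over paths to dest of the sum of
    queue lengths of the nodes after n on the path (math.inf if unreachable)."""
    nodes = list(queue_c)
    dist = {n: math.inf for n in nodes}
    dist[dest] = 0.0  # nothing lies downstream of the destination
    # At most N - 1 relaxation sweeps are needed for N nodes.
    for _ in range(len(nodes) - 1):
        changed = False
        for n in nodes:
            if n == dest:
                continue
            for k in out_neighbors.get(n, []):
                # Cost of link (n, k) is the queue length at the receive node k.
                candidate = queue_c[k] + dist[k]
                if candidate < dist[n]:
                    dist[n] = candidate
                    changed = True
        if not changed:
            break
    return dist
```

Dividing the returned value for node `n` by `z` gives the bias in (30); in a distributed setting each sweep would correspond to one round of neighbor-to-neighbor signaling.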

VI. Enhanced Joint Flow Control and BP 
Algorithms with Arbitrary Arrival Rates 

In this section, we consider the case where $\boldsymbol{\lambda} \notin \Lambda$, and
develop a new class of enhanced joint flow control and BP 
algorithms. 

A. Transport Layer and Network Layer Queue Dynamics 

When $\boldsymbol{\lambda} \notin \Lambda$, the network cannot be stabilized by any
feasible resource allocation and routing policy. Rather, in order 
to stabilize the network, a flow controller must be placed in 
front of each network layer queue at the source nodes to control 
the amount of data admitted into the network layer. Newly 
arriving data first enter transport layer storage reservoirs before 
being admitted to the network layer [8]. Let $Q_{n,\max}^{(c)}$ and $Q_n^{(c)}(t)$
denote the transport layer buffer size and the QSI of commodity
$c$ data (bits) at node $n$ at the beginning of slot $t$, respectively.
The buffer size $Q_{n,\max}^{(c)}$ can be infinite or finite (possibly zero).
Similarly, for each commodity $c \in \mathcal{C}$, we set $Q_{n,\max}^{(c)} = 0$
and $Q_n^{(c)}(t) = 0$ for all $t$, if node $n$ is the destination node
of commodity $c$. Let $r_n^{(c)}(t) \ge 0$ denote the amount of data
admitted to the network layer queue of commodity $c$ data (bits)
at node $n$ from the transport layer queue during slot $t$. Thus,
we require $r_n^{(c)}(t)\Delta \le Q_n^{(c)}(t)$. We assume $r_n^{(c)}(t) \le r_{n,\max}^{(c)}$,
where $r_{n,\max}^{(c)}$ is a positive constant which limits the burstiness
of the admitted arrivals to the network layer [8]. For each
commodity $c \in \mathcal{C}$ and node $n \in \mathcal{N}$, $n \neq \mathrm{dest}(c)$, we have
the following transport and network layer queue dynamics:

$$Q_n^{(c)}(t+1) = \min\left\{ Q_n^{(c)}(t) - r_n^{(c)}(t)\Delta + A_n^{(c)}(t)\Delta, \; Q_{n,\max}^{(c)} \right\} \qquad (31)$$

$$U_n^{(c)}(t+1) \le \left( U_n^{(c)}(t) - \sum_{b \in \mathcal{N}} \mu_{nb}^{(c)}(t)\Delta \right)^{+} + r_n^{(c)}(t)\Delta + \sum_{a \in \mathcal{N}} \mu_{an}^{(c)}(t)\Delta. \qquad (32)$$
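The two queue updates in (31) and (32), as read here, can be written as a pair of one-line helpers. This is a sketch under assumptions: the function names are hypothetical, all rates are scalars for a single node and commodity, and the second helper returns the right-hand side of (32), which is only an upper bound on the next network layer queue length.

```python
import math


def transport_queue_update(q: float, r: float, arrivals: float,
                           q_max: float = math.inf, delta: float = 1.0) -> float:
    """Transport layer queue after one slot, truncated at the buffer size q_max."""
    return min(q - r * delta + arrivals * delta, q_max)


def network_queue_bound(u: float, mu_out: float, r: float, mu_in: float,
                        delta: float = 1.0) -> float:
    """Upper bound on the next network layer queue: serve first, then add
    admitted data and data offered by upstream neighbors."""
    return max(u - mu_out * delta, 0.0) + r * delta + mu_in * delta
```

The bound is tight when every upstream neighbor actually has enough data to fill its offered rate; otherwise the true queue is smaller.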


B. Enhanced Joint Flow Control and BP Algorithm 

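The goal of the flow control is to support a portion of
the traffic demand $\boldsymbol{\lambda}$ which maximizes the sum of utilities
when $\boldsymbol{\lambda} \notin \Lambda$. Let $h_n^{(c)}(\cdot)$ be the utility function associated
with the input commodity $c$ data at node $n$. Assume $h_n^{(c)}(\cdot)$ is
non-decreasing, concave, continuously differentiable and non-negative.
Define a $\boldsymbol{\theta}$-optimal admitted rate as follows:

$$\mathbf{r}^{*}(\boldsymbol{\theta}) = \arg\max_{\mathbf{r}} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} h_n^{(c)}(r_n^{(c)}) \qquad (33)$$

$$\text{s.t.} \quad \mathbf{r} + \boldsymbol{\theta} \in \Lambda \qquad (34)$$

$$\mathbf{0} \preceq \mathbf{r} \preceq \boldsymbol{\lambda} \qquad (35)$$

where $\mathbf{r}^{*}(\boldsymbol{\theta}) = (r_n^{*(c)}(\boldsymbol{\theta}))$, $\mathbf{r} = (r_n^{(c)})$ and $\mathbf{0} \preceq \boldsymbol{\theta} = (\theta_n^{(c)}) \in \Lambda$.
The constraint in (34) ensures that the admitted rate to the
network layer is bounded away from the boundary of the
network stability region by $\boldsymbol{\theta}$. Due to the non-decreasing
property of the utility functions, the maximum sum utility over
all $\boldsymbol{\theta}$ is achieved at $\mathbf{r}^{*}(\boldsymbol{\theta})$ when $\boldsymbol{\theta} = \mathbf{0}$.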

We now develop a new class of enhanced joint flow control
and BP algorithms that yield a throughput vector which can be
arbitrarily close to the optimal solution $\mathbf{r}^{*}(\mathbf{0})$. Following [8,
pp. 90], we introduce the auxiliary variables $\gamma_n^{(c)}(t)$ and the
virtual queues $Y_n^{(c)}(t)$ for all $n \in \mathcal{N}$ and $c \in \mathcal{C}$.$^8$

Algorithm 2 (Enhanced Joint Flow Control and BP): Let a
set of bias functions $\mathbf{f} \succeq \mathbf{0}$ be given. In each slot $t$, the
controllers observe the network layer QSI $\mathbf{U}(t)$, virtual QSI
$\mathbf{Y}(t)$ and topology state $S(t)$, and perform the following flow
control, resource allocation and routing actions.

Flow Control: For each node $n \in \mathcal{N}$ and commodity $c \in \mathcal{C}$,
the flow controller observes the transport layer QSI $Q_n^{(c)}(t)$ and
the virtual QSI $Y_n^{(c)}(t)$, and chooses the admitted data rate at
slot $t$, which also serves as the output rate of the corresponding
virtual queue:

$$r_n^{(c)}(t) = \begin{cases} \min\left\{ Q_n^{(c)}(t)/\Delta, \; r_{n,\max}^{(c)} \right\}, & Y_n^{(c)}(t) > U_n^{(c)}(t) \\ 0, & \text{otherwise.} \end{cases}$$


8 Note that the flow control part of Algorithm 2 is the same as that in the 
traditional joint flow control and BP algorithm in [8, pp. 90]. The difference 
lies in the resource control and routing part. We describe the flow control part 
here for the purpose of completeness. 


The flow controller then chooses the auxiliary variable, which
serves as the input rate to the corresponding virtual queue:

$$\gamma_n^{(c)}(t) = \arg\max_{\gamma} \; M h_n^{(c)}(\gamma) - Y_n^{(c)}(t)\gamma\Delta \qquad (36)$$

$$\text{s.t.} \quad 0 \le \gamma \le r_{n,\max}^{(c)}$$

where $M > 0$ is a control parameter which affects the utility-delay
tradeoff of the algorithm. Based on the chosen $r_n^{(c)}(t)$
and $\gamma_n^{(c)}(t)$, the transport layer QSI is updated according to
(31) and the virtual QSI is updated according to:

$$Y_n^{(c)}(t+1) = \left( Y_n^{(c)}(t) - r_n^{(c)}(t)\Delta \right)^{+} + \gamma_n^{(c)}(t)\Delta \qquad (37)$$

where $Y_n^{(c)}(0) = 0$ for all $n \in \mathcal{N}$ and $c \in \mathcal{C}$.

Resource Allocation and Routing: Same as Algorithm 1. 
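For the logarithmic utility used later in the numerical experiments, the maximization in (36) has a closed form, so one slot of the flow control part can be sketched compactly. This is a hedged illustration only: the function name `flow_control_step` and its scalar interface are assumptions, and the closed form $\gamma = \min\{M/(Y\Delta), r_{\max}\}$ follows from setting the derivative of $M\log\gamma - Y\gamma\Delta$ to zero, which applies to the log utility and not to a general utility function.

```python
from typing import Tuple


def flow_control_step(q: float, y: float, u: float, r_max: float,
                      m: float, delta: float = 1.0) -> Tuple[float, float, float]:
    """One slot of flow control for a single source queue with log utility.

    Returns (admitted rate r, auxiliary variable gamma, next virtual queue).
    """
    # Admit data only when the virtual queue exceeds the network layer queue.
    r = min(q / delta, r_max) if y > u else 0.0
    # Maximizer of m*log(gamma) - y*gamma*delta on (0, r_max].
    gamma = r_max if y <= 0.0 else min(m / (y * delta), r_max)
    # Virtual queue update: serve r, then add gamma.
    y_next = max(y - r * delta, 0.0) + gamma * delta
    return r, gamma, y_next
```

A larger `m` pushes `gamma` toward `r_max` for longer, which raises the admitted rate (and utility) at the cost of larger queues, mirroring the utility-delay tradeoff that the control parameter $M$ governs.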


C. Performance Analysis 

The following theorem summarizes the utility-delay tradeoff
of Algorithm 2. For notational simplicity, we assume $\Delta = 1$.

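Theorem 2 (Utility-Delay Tradeoff of Alg. 2): For an arbitrary
arrival rate vector, for any transport layer buffer size, and
for any control parameter $M > 0$, given $\boldsymbol{\epsilon} = (\epsilon_n^{(c)}) \in \mathrm{int}(\Lambda)$,
there exist $\boldsymbol{\delta} = (\delta_n^{(c)}) \succ \mathbf{0}$ such that $\boldsymbol{\epsilon} + \boldsymbol{\delta} \in \Lambda$, and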
z = (zn ' > ) >- 0 such that e z = ) A e. Then, the 

queueing network under Algorithm 2, with the bias functions 
chosen according to (19), satisfies: 


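$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} \mathbb{E}\left[U_n^{(c)}(\tau)\right] \le \frac{NB + M h_{\max}}{\beta_z} \qquad (38)$$

$$\liminf_{t \to \infty} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} h_n^{(c)}\left(\bar{r}_n^{(c)}(t)\right) \ge \sum_{n \in \mathcal{N}, c \in \mathcal{C}} h_n^{(c)}\left(r_n^{*(c)}(\boldsymbol{\epsilon})\right) - \frac{NB}{M} \qquad (39)$$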


where 


B = ^E ((^Eax) 2 + (r n ,max + < max ) 2 + 2(r„, max ) 2 ) 


■te M 


K = 


sup mm <; el c) + 6 ( c) - 

{(e,<5):eVe z ,<5^0, n ^ N ^ C 
€+<?£ A} 


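with $r_{n,\max} \triangleq \sum_{c \in \mathcal{C}} r_{n,\max}^{(c)}$, $h_{\max} \triangleq \sum_{n \in \mathcal{N}, c \in \mathcal{C}} h_n^{(c)}(r_{n,\max}^{(c)})$, and $\bar{r}_n^{(c)}(t) \triangleq \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[r_n^{(c)}(\tau)\right]$.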
Proof: Please refer to Appendix G. ■ 

Remark 2: Theorem 2 should be interpreted as follows.
When $\mathbf{0} \prec \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$, one can choose a finite $\mathbf{z} \succ \mathbf{0}$ such


2 ft Tf c ) 

- 1 1 'max 17 


(c) 

Zn 


(40) 


(41) 
that $\boldsymbol{\epsilon}_z \preceq \boldsymbol{\epsilon}$. In this case, the limiting average total queue
size under Algorithm 2 is upper bounded as in (38). At the
same time, the limiting sum utility is lower bounded as in (39),
thereby specifying a utility-delay tradeoff. In Section VII, we
will demonstrate numerically that Algorithm 2 indeed yields
a better utility-delay tradeoff than the traditional flow control-BP
algorithm in [8, pp. 90]. When $\boldsymbol{\epsilon} = \mathbf{0}$, i.e., no margin is
given, $z_n^{(c)}$ is chosen to be infinity for all $n \in \mathcal{N}$ and $c \in \mathcal{C}$ (i.e.,
$\mathbf{f} = \mathbf{0}$). In this case, Algorithm 2 reduces to the traditional flow
control-BP algorithm, and Theorem 2 reduces to the traditional
utility-delay tradeoff result (Theorem 5.8 in [8]).



Fig. 1: Network topology and commodities [10]. $d_{in} = 5$ and $L = 224$.


VII. Numerical Experiments 

In the numerical simulations, we consider the same simulation
setup as in [10] for ease of comparison. Specifically, we
consider a network with 64 nodes and four clusters as shown in 
Fig. 1. Each cluster is a 4 x 4 regular grid with two randomly 
added links. Two adjacent clusters are connected by two links. 
All links are bidirectional with a maximum transmission rate 
of one packet/slot for both directions. We consider the wireline 
case, in which all links can transmit simultaneously. As in 
[10], we consider 8 commodities corresponding to the following
source-destination pairs: ((1,3), (2,5)), ((2,3), (2,7)),
((2,2), (1,6)), ((3,4), (2,7)), ((1,1), (1,7)), ((4,3), (5,4)),
((4,6), (6,6)) and ((5,3), (5,6)). The packet arrival processes
are Poisson. We compare the performance of our enhanced
BP algorithms discussed in Section V, i.e., BPnxt and BPmin,
and the shortest path biased versions of our enhanced
BP algorithms, i.e., BPnxtbias and BPminbias,$^9$ with several
baseline schemes, such as the traditional BP algorithm [7],
[8], the BPbias algorithm [8], [9], and the BPSP algorithm
[10]. In the simulations, we use the average number of packets
in the network as the performance measure, a quantity which
is linearly related to the average delay by Little's Law. The
average performance is evaluated over $10^5$ time slots.
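The conversion between the reported performance measure and average delay is a direct application of Little's Law: average delay equals the average number of packets in the network divided by the total arrival rate. The helper below is a small sketch of that conversion; the function name and the assumption that all commodities share the same per-commodity arrival rate (as in the equal-rate experiments) are introduced here for illustration.

```python
def average_delay(avg_packets_in_network: float,
                  num_commodities: int,
                  rate_per_commodity: float) -> float:
    """Little's Law: delay (slots) = average occupancy / total arrival rate."""
    total_rate = num_commodities * rate_per_commodity
    if total_rate <= 0:
        raise ValueError("total arrival rate must be positive")
    return avg_packets_in_network / total_rate
```

For instance, with 8 commodities at 0.5 packets/slot each, an average occupancy of 24 packets would correspond to an average delay of 6 slots; these numbers are illustrative and not taken from the reported results.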

A. Enhanced BP Algorithms 

Figures 2, 3 and 4 show the average number of packets in 
the network versus the arrival rate in the light and moderate 
loading regimes. Here, all commodities have the same arrival
rate, i.e., $\lambda_n^{(c)} = \lambda$ for $n = \mathrm{src}(c)$ and $c \in \mathcal{C}$, where $\mathrm{src}(c)$
denotes the source node of commodity $c$. First, from Fig. 2,
we can see that with the minimum next-hop queue length bias 
and the minimum downstream sum queue length bias, the delay 
performance of the traditional BP can be improved, by using 
BPnxt (z = 1,2,5) and BPmin (z = 1,2, 5), respectively. It can 

9 As in BPbias, we add two QSI-independent shortest path bias terms $B_a$
and $B_b$ to the instantaneous QSI at nodes $a$ and $b$ in (21), where $B_a$ and $B_b$
are parameterized by the per-link cost $B$.
Fig. 2: Delay of BPnxt and BPmin at z = 1, 2, 5.


be verified that arrival rates $\lambda < 2.5$ can be stabilized [8, p. 32].
Thus, when $\lambda < 0.5$, $z = 5$ satisfies the sufficient condition
in (27) for throughput optimality of BPnxt and BPmin. On
the other hand, as discussed in Section IV-D, the choice of $z$
satisfying (27) is not necessarily optimal for delay performance.
From Fig. 2, it can be seen that the delay for BPnxt (BPmin)
with $z = 1$ is at most 28.7% (12.1%) of the delay for BP for
$\lambda = 0.1, \dots, 0.6$. When $z$ increases, the delay performance

gain of BPnxt (BPmin) over BP decreases, as the effect of 
the queue length bias reduces. In addition, we see that the 
delay performance of BPbias and BPSP is very sensitive to 
the choices of parameters B and K, where B is the per- 
link cost in obtaining the shortest path bias and K is the 
control parameter for traffic splitting. Specifically, when B and 
K are small, the delay performance improvement of BPbias 
and BPSP over traditional BP is not significant. When B 
and K are large, as compared with traditional BP, BPbias 
and BPSP have significantly lower delay in the small traffic 
loading regime. On the other hand, in the moderate (and heavy) 
traffic loading regime, a small delay reduction or even a delay 
Fig. 3: Delay of BPnxtbias and BPminbias at z = 1.

Fig. 4: Delay of BPnxtbias and BPminbias at z = 5.

Fig. 5: Utility-delay tradeoff of BPnxt and BPmin at z = 1, 2, 5.

Fig. 6: Utility-delay tradeoff of BPnxtbias and BPminbias at z = 1.


increase is observed. This is because BPbias and BPSP, by 
improving delay performance using the shortest path concept, 
may easily cause heavy congestion on the shorter paths when 
B and K are large or when the traffic load is high. Thus, it 
is difficult in general to determine beforehand the proper B 
and K parameters. In contrast, for small z, the delay of the 
proposed BPnxt and BPmin algorithms, which improve delay 
performance by exploiting (dynamic) downstream congestion 
information, is small across the light and moderate loading 
regimes. 

Next, from Fig. 3 and Fig. 4, we can see that with the 
additional minimum next-hop queue length bias and minimum 
downstream sum queue length bias, the delay performance 
of BPbias (B = 1,2,10) can be further improved, under 
the same parameter B. This supports our conjecture that by 


considering more QSI, we can substantially improve the delay 
performance of BP-based algorithms. Similar to BPbias, the 
delay performance of BPnxtbias and BPminbias is also sensitive 
to the choice of B. However, with small B and z, BPnxtbias 
(BPminbias) can achieve good delay performance across the 
light and moderate loading regimes. For example, the delay of
BPnxtbias (BPminbias) at $z = 1$ and $B = 1$ is at most 11.2%
(4.1%) of the delay of BP for $\lambda = 0.1, \dots, 0.6$.

B. Enhanced Joint Flow Control and BP Algorithms 

Fig. 5 and Fig. 6 illustrate the average number of packets 
in the network versus the sum utility of the admitted rate 
over all commodities. We consider proportional fairness by 
choosing the logarithmic utility function. Specifically, for each 
commodity $c \in \mathcal{C}$, we choose $h_n^{(c)}(x) = \log(x)$ for $n = \mathrm{src}(c)$
and $h_n^{(c)}(x) = 0$ for all $n \neq \mathrm{src}(c)$. We choose = 3
and $r_{n,\max}^{(c)} = 1$ for $n = \mathrm{src}(c)$. The utility-delay tradeoff
curve is obtained by choosing different control parameters $M$.
As in Section VII-A, we see that the utility-delay tradeoffs for
BPnxt and BPmin are significantly improved when compared
with that of traditional BP. In addition, the performance of
BPnxtbias and BPminbias is very competitive when compared
with BPbias and BPSP.

C. Discussion 


Appendix A: Proof of Lemma 1 
First, define a real valued function 


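$$h(s, \mathbf{u}) = \min_{(I, \boldsymbol{\nu})} \left\{ \sum_{n \in \mathcal{N}, c \in \mathcal{C}} u_n^{(c)} + \sum_{\mathbf{u}' \in \mathcal{U}} P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu}) V(\mathbf{u}') \right\} - d. \qquad (42)$$

By (42) and (10), we have $d + V(\mathbf{u}) = \sum_{s \in \mathcal{S}} \Pr[S = s](h(s, \mathbf{u}) + d)$, implying: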

Algorithm            Normalized Simulation Time
BP (BPbias)          1 (1.1)
BPnxt (BPnxtbias)    1.8 (1.9)
BPmin (BPminbias)    12.6 (12.7)
BPSP                 110.5

TABLE II: Comparisons on average normalized simulation time.


Table II shows the average normalized simulation time of 
all the considered algorithms for the network topology and 
commodities in Fig. 1. We can see that the BP, BPbias, BPnxt, 
BPnxtbias, BPmin, BPminbias and BPSP algorithms are in 
increasing order of complexity. When only low computation 
cost is acceptable, BPnxtbias, i.e., the combination of a small 
shortest path bias (i.e., small B ) and the minimum next-hop 
queue length bias with small parameter (i.e., small z), seems 
to result in delay performance close to or better than that 
of BPbias and BPSP across the light and moderate loading 
regimes. The small shortest path bias captures essential path 
length information, and the minimum next-hop queue length 
bias with small parameter z captures essential congestion 
information on each path. Both bias terms help to correct 
the myopic nature of the traditional BP algorithm. When high 
computation cost is acceptable, e.g., in small networks, we can 
consider BPmin with small z or BPminbias with small B and 
z for further delay performance improvement. 

VIII. Conclusion 

In this paper, we show that the asymptotically delay optimal
control resembles the BP algorithm in basing resource allocation
and routing on a backpressure calculation, but differs from
the BP algorithm in the form of the backpressure calculation
employed. Motivated by this connection, we introduce a new
class of enhanced BP-based algorithms which incorporate a
general queue-dependent bias function into the BP backpressure
term to substantially improve delay performance. We demonstrate
the throughput optimality and the utility-delay tradeoff
for the proposed algorithms. We further elaborate on two
specific algorithms within this class, which have demonstrably
improved delay performance while maintaining acceptable
implementation complexity.


$$V(\mathbf{u}) = \sum_{s \in \mathcal{S}} \Pr[S = s] h(s, \mathbf{u}). \qquad (43)$$

Substituting (43) into (42), we have: 


d + h(s, u) 


( fl ) • 
= mm 
r> 


E U< n + E P (s,u),( S ',u 


n£Af 

cec 


>, (44) 


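where (a) is due to (8). In addition, we have
$\mathbb{E}[h(S(t), \mathbf{U}(t)) \mid (S(0), \mathbf{U}(0)) = (s, \mathbf{u}), \omega] \overset{(b)}{=} \sum_{\mathbf{u}'} \Pr[\mathbf{U}(t) = \mathbf{u}' \mid (S(0), \mathbf{U}(0)) = (s, \mathbf{u}), \omega] \times \left( \sum_{s'} \Pr[S(t) = s'] h(s', \mathbf{u}') \right) \overset{(c)}{=} \mathbb{E}[V(\mathbf{U}(t)) \mid (S(0), \mathbf{U}(0)) = (s, \mathbf{u}), \omega]$,
where (b) is due to the i.i.d. property of $S(t)$ and (c) is due to (43). Thus, by (11),
we have:

$$\lim_{t \to \infty} \frac{1}{t} \mathbb{E}[h(S(t), \mathbf{U}(t)) \mid (S(0), \mathbf{U}(0)) = (s, \mathbf{u}), \omega] = 0. \qquad (45)$$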

Note that (44) and (45) correspond to conditions (4.121) and 
(4.122) of Proposition 4.6.1 in [6, pp. 254]. In addition, S xU 
is countably infinite and I x 1Z is finite. Thus, by Proposition 
4.6.1 [6, pp. 254], we can prove Lemma 1. 


Appendix B: Proof of Lemma 2 
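Let $\mathbf{u}' = \mathbf{U}(t+1)$, $\mathbf{u} = \mathbf{U}(t)$, $A_n^{(c)} = A_n^{(c)}(t)$ and
$\nu_{ab}^{(c)} = \nu_{ab}^{(c)}(t)$. The vector form of (3) can be written as
$\mathbf{u}' = \mathbf{u} - \left(\sum_{b \in \mathcal{N}} \nu_{nb}^{(c)} \Delta\right) + \left(A_n^{(c)} \Delta\right) + \left(\sum_{a \in \mathcal{N}} \nu_{an}^{(c)} \Delta\right)$. By Taylor's
theorem for multivariate functions, we have
$V(\mathbf{u}') = V(\mathbf{u}) + \Delta \sum_{n \in \mathcal{N}, c \in \mathcal{C}} V'_{n,(c)}(\mathbf{u}) \left( A_n^{(c)} + \sum_{a \in \mathcal{N}} \nu_{an}^{(c)} - \sum_{b \in \mathcal{N}} \nu_{nb}^{(c)} \right) + o(\Delta)$ [17]. Thus, we have:

$$\sum_{\mathbf{u}' \in \mathcal{U}} P_{(s,\mathbf{u}),\mathbf{u}'}(I, \boldsymbol{\nu}) V(\mathbf{u}') = V(\mathbf{u}) + \Delta \sum_{n \in \mathcal{N}, c \in \mathcal{C}} V'_{n,(c)}(\mathbf{u}) \lambda_n^{(c)} - \Delta \sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \nu_{ab}^{(c)} \left( V'_{a,(c)}(\mathbf{u}) - V'_{b,(c)}(\mathbf{u}) \right) + o(\Delta). \qquad (46)$$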

By (12) and (46), we have: 

w*(s,u) = arg min ^ P (SjU) , u -(/, iz)L(u') = 

u ' £U 
arg min I

-A 

E 

E 

(c) 


)(«)- Vfc,(c)(' 

u)) + o(A) 

Lv V 

C 

a,6)6£ 

cGC 












(47)' 

Let w*( 

s,u) 


= 


(I\ 

^), wt(a, 

u) 


T l 

s,u)B 



_A 

„ (c) , 

EnGAf.cGC Un + 

Es'.u' P(s-; 

»)>(«'■ 

,u')(X 

‘.L 



and T7 

(s 

u A 

h = 

> u ) 

SneAf,cGC 

( c ) 

Wn 

+ Ei 

>' ,u' 


u),(s',U 


, u'), where 


h = (h(s,u)). By (46) and (47), we have: 


For this equality, the R.H.S. is a function of u and the L.H.S. 
'is a function of u k . Thus, the above equality cannot hold. By 
contradiction, we can show that for some n £ Af and c £ C, 
there does not exist a function gn\ulf*), such that V' n ( c )( u ) = 

gn\un' > ). Thus, we can always find u £ U, such that c^ b (u) ^ 
cL(u) and A(u) ^ <51/^(ir) for some (a, b) £ £, where 
c^ b (u) and are defined in a similar way to c^ h (u) and 

u) but with tin 1 instead of V~' n ( c )( u )- Therefore, we can 
find some (s,u)eSxii such that w'(s, u)^ u;*(s, u). 


u ' £U 

= E P(s,»),u'(l\^)V(u') + o(A) (48) 

u ' £U 

%,,u)h = T ( t s>u) h + o(A) 

^d(A) + h{s , u) = XE) h + o( A) 

=>T^ s u) h = d(A) + h(s,u) + o(A), (s,u) £ S xU, (49) 

where (a) is due to (43) and (8), and (b) is due to (44). As in the 
proofs of Proposition 4.1.6 in [6, pp. 191] and Proposition 4.6.1 
in [6, pp. 254], from (49), we can show d^(A) = d(A) + o(A) 
as A —> 0. 


Appendix D: Proof of Lemma 4 

Let w J (s,u) = (I*, v*) and T? ,h A + 

a ’ cGC 

By (46), (13) and (17), 


: know E u) h > T ( E)h + o( A) for all (s,u) 
addition, by Lemma 3, we know that there e 


we 

In addition, by 
and (s, u) £ S x U, such that t7 


i ih > T , 
(s.u) — (s,u) 


£ S XU. 
exist e > 0 
h + o(A) + e. 


Combining (49), we have for all (s,u) £ S xU, T^ s u ^h > 
d(A) + h(s 7 u) + o(A), and for some (s, u) £ S x U, 
E u) h > d( A) + h(s,u) + o(A) + e. As in the proofs of 
Propositions 4.1.6 in [6, pp. 191] and 4.6.1 in [6, pp. 254], we 
can show S( A) > d( A) + o(A) + e. Thus, by Lemma 2, we 
have d$( A) — dt(A) > e + o(A) as A —x 0. 


Appendix C: Proof of Lemma 3 
Let wt(s,u) = (/'*’, i/t). By (10), (46) and (48), for all u £ 
U, we have: 

d = E U n ) + A E y n,(c)( U )^,W(u)+0( A) 

nGA/\eGC n€A/",c€C 

(50) 


where 


- n,(c) 


» = a( c 


sG«S 


Pr [S = s} 


V !/Wt 

/ ^ an 


E ^ c)t 

bGAf 


nb 


Suppose for all n £ Af and c £ C, there exists a function 
gn\un ) ), such that fA i(c )(u) = gn\u ( a ) ). By (13), we know 
that F n ,( c )(u) is still a function of the global QSI u. In addition, 


by (50) and T^ (c) (u) = gn\u E), we have: 

d - +o(A) 


flfc,( C ')(E c) ) =- 


cec 


Aft,(c')( u ) 


^ ^2 n£j\f,c(E.C 9n,{c){^n )-^n,(c)(u) 
(n,c)^(fc,c / ) 

AF fei(c /)(u) 


10 Equality (48) is due to the following. Let fi (,r) and / 2 (x) be two functions 
of x. Let x * = arg rnin ; ,(/'i (x) + and x* — arg rnirr, fi(x). Then, 

we have /i(aA) + / 2 (i*) < /i(x*) + h(x*) < /i(a; + ) + h(x^). 


Appendix E: Proof of Theorem 1 
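Define the Lyapunov function $L(\mathbf{u}) = \sum_{n \in \mathcal{N}, c \in \mathcal{C}} \left(u_n^{(c)}\right)^2$.
The Lyapunov drift at slot $t$ is $\Delta(\mathbf{U}(t)) = \mathbb{E}[L(\mathbf{U}(t+1)) - L(\mathbf{U}(t)) \mid \mathbf{U}(t)]$.
Squaring both sides of (18) and following steps
similar to those in [8], we have:$^{11}$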

L (U(f + 1)) — L (U(f)) 

<2NB + 2 E Ui c \t)A^(t)-2 E E^W 

nGAf.cGC (a,b)eCcGC 

X ((E C) W + /i c) (U(£))) - (E C) W +/ 6 (c) (U(f)))) 

+ 2 E T,^ab(t)(fi c) m))-/b c) m))) 

(a,b)eC cGC 

( <2AS + 2 E UP\t)A^(t)-2 E E^W 

nGAf,cGC (a,fe)G£cGC 

X ((cf'(t) + A c) (U(t») - (cfm + /, M (U(t»)) 

+ 2 E (51) 

nGAf,cGC 

where (a) is due to the following: 

E E w (/i c) (u(*)) - fi c) ( u w)) 

(a,6)G£ cGC 

11 Note that denotes the action of Algorithm 1. 
(a,b)&C cec fee AT


<E E ^E-r^’w 

cec (a,b)e£(<o feeA^ z fe 

= E PPwEl 

neA/",ceC ~n 


(52) 


Taking conditional expectations on both sides of (51), we have: 

A (U(t)) 

( 6 ) 


< 27VB + 2 E ^ C) W^ C) -2E 

neA/",ceC 


E E^oE) 

L(a,fe)e£ cec 


the upper bound over all possible $(\boldsymbol{\epsilon}, \boldsymbol{\delta})$, we complete the
proof of Theorem 1.

Appendix F: Proof of Sufficient Condition for z 

Using the proof of Theorem 1, we replace (52) with (56) to
show that for all $z$ satisfying (27), BPnxt stabilizes the network
for any $\boldsymbol{\lambda}$ satisfying $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$:

E E^w^’K’^-^’K’' 

(a,6)££ cGC 

< e ezv>E^ 

(a,b)6£ cGC 


X ((^i c) (t) + /i c) (U(t))) - (EE) + /E(U(f)))) |u (t) 


< E EEco 


. EE*) 


r r,( c ) 

1 ••max 17 


+2 e EE*) 

neA^.ceC 

= 27V73 + 2 E EE*) A E 

neA^.ceC 


E Efc (*) - E ^on (*) ) U (*) 

fceJV aeTV / 


-2 E EE*)® 

neA^.ceC 

+ 2 E E E [EE*) (/ b (c) (UW) - /i c) (U(f))) |u(t) 


(a,b)e£ ceC 

+ 2 E EE*) 


N ^ 7 

(a,6)6£ cGC iV out,a^ 

/V- v p ^ fee ^uEE*) 

-Z^ " max at(c) 

ceC (a,6)e£U) Jv out,a^ 

y-y- V- o ^E^U^E*) 

-Z^Z^ Z^ ^ max Ar(c) 

c£C a£Af utzju-ic) iy out,a /C ' 

^-" /v out,a 

(c) / . \ -*^n 


ft t( c ) 

(0 

% 


(53) 


neA^,ceC 

where (b) is due to the fact that Algorithm 1 minimizes the 
R.H.S. of (b) over all possible alternative actions $\tilde{\mu}_{ab}^{(c)}(t)$. Since 
$\boldsymbol{\lambda} + \boldsymbol{\epsilon} + \boldsymbol{\delta} \in \Lambda$, by Corollary 3.9 of [8], there exists a stationary 
randomized policy that makes decisions based only on $S(t)$ 
(i.e., independent of $\mathbf{U}(t)$) such that 


EE E EE*) 

cec aeAf fegytP n 

= E EE*) 

n£j\T,c£C 

< E EE 


i? A7\^ 

Z 

■^max^in 


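$$\mathbb{E}\left[ \sum_{b \in \mathcal{N}} \tilde{\mu}_{nb}^{(c)}(t) - \sum_{a \in \mathcal{N}} \tilde{\mu}_{an}^{(c)}(t) \,\middle|\, \mathbf{U}(t) \right] = \epsilon_n^{(c)} + \delta_n^{(c)} + \lambda_n^{(c)}. \qquad (54)$$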


On the other hand, similar to (52), we have: 

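$$\sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \mathbb{E}\left[ \tilde{\mu}_{ab}^{(c)}(t) \left( f_b^{(c)}(\mathbf{U}(t)) - f_a^{(c)}(\mathbf{U}(t)) \right) \,\middle|\, \mathbf{U}(t) \right] \le \sum_{n \in \mathcal{N}, c \in \mathcal{C}} \cdots \qquad (55)$$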


Substituting (54) and (55) into (53), we have 

A(U(f)) < 2NB—2 mi nrlg _A/ jCg c {e&° + } x 

J2neAT cgC EE*)- By Lemma 4.1 of [8] and by minimizing 


(56) 

z 

neAf,cec 

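Here, $\mathcal{N}_{\mathrm{out},n}^{(c)} = \{k : (n,k) \in \mathcal{L}^{(c)}\}$, $N_{\mathrm{out},n}^{(c)} \triangleq |\mathcal{N}_{\mathrm{out},n}^{(c)}|$, $\mathcal{N}_{\mathrm{in},n}^{(c)} = \{k : (k,n) \in \mathcal{L}^{(c)}\}$, $N_{\mathrm{in},n}^{(c)} \triangleq |\mathcal{N}_{\mathrm{in},n}^{(c)}|$ and $d_{\mathrm{in}} = \max_{n \in \mathcal{N}, c \in \mathcal{C}} N_{\mathrm{in},n}^{(c)}$. 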

Similarly, we replace (52) with (57) to show that for all 
$z$ satisfying (27), BPmin stabilizes the network for any $\boldsymbol{\lambda}$ 
satisfying $\boldsymbol{\lambda} + \boldsymbol{\epsilon} \in \mathrm{int}(\Lambda)$: 

E E EE*) (/i c) ( u w) - /E (u(*))) 

(a,b)eC cGC 

< E EE b } (f) E (c)(u w) + EE) - r; (c) (uw) 

(a,b)ec cec 

= E EcSwE 1 

(a,b)e£ cec 


<E E 

cGC (a,&)g£(<=) 
= EE E ^ 

ceC a^M beN^X 


ul c \t ) 


r(c) 


JD j\T\ L 

= J2 E c) w Vn 

n£N,c£C 


< E E c) w 

n£Af,c£C 




A (©(f)) - ME 

E E 

Z) ( 7 { n C \t)) 

0(f) 


n£j\f,ceC 




( 6 ) 

< 


E (^n (c) W-^ c) (*))E[r^(i)|0(t) 

neA/",ceC 

E E [ M/l - C) (in\tj) ^ Y^(t)^ n C \t)\&(t) 

n&N',ceC 


12 Note that $r_n^{(c)}(t)$, $\gamma_n^{(c)}(t)$ and $\mu_{ab}^{(c)}(t)$ denote the actions of Algorithm 2. 


E EJ (*) - E EnW) ©w 

beAf aGAf / 


- £ u^m 

n£Af,c£C 

E E e [E bE)(/ b (c) (uw)-/i c) (uw)) |©(*) 


(57) 


(a,i>)e£ ceC 

E E c) (i) 

n£j\f,ceC 


r rX c ) 

1 t max- f - y 


(59) 


Appendix G: Proof of Theorem 2 
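Define the Lyapunov function $L(\boldsymbol{\theta}) = \frac{1}{2} \sum_{n \in \mathcal{N}, c \in \mathcal{C}} \left( \left(u_n^{(c)}\right)^2 + \left(y_n^{(c)}\right)^2 \right)$, where $\boldsymbol{\theta} = (\mathbf{u}, \mathbf{y})$. Denote 
$\boldsymbol{\Theta}(t) = (\mathbf{U}(t), \mathbf{Y}(t))$. The Lyapunov drift at slot $t$ is 
$\Delta(\boldsymbol{\Theta}(t)) = \mathbb{E}\left[ L(\boldsymbol{\Theta}(t+1)) - L(\boldsymbol{\Theta}(t)) \mid \boldsymbol{\Theta}(t) \right]$. Squaring both 
sides of (32) and (37) and following steps similar to those in 
[8], we have:$^{12}$ 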

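$$
\begin{aligned}
& L(\boldsymbol{\Theta}(t+1)) - L(\boldsymbol{\Theta}(t)) \\
&\le NB + \sum_{n \in \mathcal{N}, c \in \mathcal{C}} U_n^{(c)}(t) r_n^{(c)}(t) - \sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \mu_{ab}^{(c)}(t) \\
&\quad \times \left( \left( U_a^{(c)}(t) + f_a^{(c)}(\mathbf{U}(t)) \right) - \left( U_b^{(c)}(t) + f_b^{(c)}(\mathbf{U}(t)) \right) \right) \\
&\quad + \sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \mu_{ab}^{(c)}(t) \left( f_a^{(c)}(\mathbf{U}(t)) - f_b^{(c)}(\mathbf{U}(t)) \right) \\
&\quad - \sum_{n \in \mathcal{N}, c \in \mathcal{C}} Y_n^{(c)}(t) \left( r_n^{(c)}(t) - \gamma_n^{(c)}(t) \right) \\
&\overset{(a)}{\le} NB + \sum_{n \in \mathcal{N}, c \in \mathcal{C}} U_n^{(c)}(t) r_n^{(c)}(t) - \sum_{(a,b) \in \mathcal{L}} \sum_{c \in \mathcal{C}} \mu_{ab}^{(c)}(t) \\
&\quad \times \left( \left( U_a^{(c)}(t) + f_a^{(c)}(\mathbf{U}(t)) \right) - \left( U_b^{(c)}(t) + f_b^{(c)}(\mathbf{U}(t)) \right) \right) \\
&\quad + \sum_{n \in \mathcal{N}, c \in \mathcal{C}} \cdots \\
&\quad - \sum_{n \in \mathcal{N}, c \in \mathcal{C}} Y_n^{(c)}(t) \left( r_n^{(c)}(t) - \gamma_n^{(c)}(t) \right) \qquad (58)
\end{aligned}
$$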

where (a) is due to (52). Taking conditional expectations and 
subtracting ME EneAMeC E c) En c) ( t)\ ©(f) from both 
sides of (58), we have: 


where (b) is due to the fact that Algorithm 2 minimizes 
the R.H.S. of (b) over all possible alternative $\tilde{r}_n^{(c)}(t)$, $\tilde{\gamma}_n^{(c)}(t)$ 
and $\tilde{\mu}_{ab}^{(c)}(t)$. It is not difficult to construct alternative random 
policies that choose $\tilde{r}_n^{(c)}(t)$, $\tilde{\gamma}_n^{(c)}(t)$, $\tilde{\mu}_{ab}^{(c)}(t)$ such that 


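$$\mathbb{E}\left[ \tilde{r}_n^{(c)}(t) \,\middle|\, \boldsymbol{\Theta}(t) \right] = r_n^{*(c)}(\boldsymbol{\epsilon} + \boldsymbol{\delta}) \qquad (60)$$

$$\tilde{\gamma}_n^{(c)}(t) = r_n^{*(c)}(\boldsymbol{\epsilon} + \boldsymbol{\delta}) \qquad (61)$$

$$\mathbb{E}\left[ \sum_{b \in \mathcal{N}} \tilde{\mu}_{nb}^{(c)}(t) - \sum_{a \in \mathcal{N}} \tilde{\mu}_{an}^{(c)}(t) \,\middle|\, \boldsymbol{\Theta}(t) \right] = r_n^{*(c)}(\boldsymbol{\epsilon} + \boldsymbol{\delta}) + \epsilon_n^{(c)} + \delta_n^{(c)} \qquad (62)$$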


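where $\mathbf{r}^*(\boldsymbol{\epsilon} + \boldsymbol{\delta}) = \left( r_n^{*(c)}(\boldsymbol{\epsilon} + \boldsymbol{\delta}) \right)$ is the target $(\boldsymbol{\epsilon} + \boldsymbol{\delta})$-optimal 
admitted rate given by (33).$^{13}$ Equation (62) 
follows from the same arguments leading to (54). Thus, 
by (60), (61), (62) and (55), from (59), we obtain 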


A(©(f)) 3/E [Ene/V.cec E ^ ^ 


0(f) 


< NB — 


min sr ■ ( I 2Jt max L (c) 1 rr( c ) />\ _ 

nim„ e yv7ceC + On ^ J MneAf.ceC Un W 

M YjneN,ceC E c) (E (c) ( e + «S)^. Applying Theorem 5.4 of 


[ 8 ], we have: 


1 


t-i 


limsup-^ ^ E[(7^ c) (r)] 

300 r=0 nGAf,cGC 

Af? + MH max 


< 


min „ /,(°) X (c) 2ii " 

min neJ V )CeC < Cn + On 


4 21 } 


(63) 


lim inf 

t—*oo 


£ e»( 


A 


^eA r.c&c 


> £ tf>(r„<‘>(e + «>)-^ 


(64) 


nGA/.ceC 


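where $\bar{\gamma}_n^{(c)}(t) = \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[ \gamma_n^{(c)}(\tau) \right]$. It is easy to prove 
$\bar{\gamma}_n^{(c)}(t) \le \bar{r}_n^{(c)}(t)$ by showing the stability of the virtual queues. 
As in [8, p. 88], we optimize the R.H.S.s of (63) and (64) 
over all possible $(\boldsymbol{\epsilon}, \boldsymbol{\delta})$. Thus, we can show (38) and (39). 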

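13 Specifically, (60) can be achieved by the randomized policy which sets 
$\tilde{r}_n^{(c)}(t) = A_n^{(c)}(t)$ with probability $r_n^{*(c)}(\boldsymbol{\epsilon} + \boldsymbol{\delta}) / \lambda_n^{(c)}$ and $\tilde{r}_n^{(c)}(t) = 0$ with 
probability $1 - r_n^{*(c)}(\boldsymbol{\epsilon} + \boldsymbol{\delta}) / \lambda_n^{(c)}$. 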
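The randomized admission rule in this footnote can be checked with a short simulation. This is a minimal sketch, assuming i.i.d. arrivals that are 0 or 2 with equal probability (so the mean arrival rate is 1.0) and a target rate of 0.6; the function names, the arrival distribution and the numbers are illustrative assumptions, not values from the paper.

```python
import random

def admit(arrival, target_rate, mean_arrival, rng):
    """Admit the whole arrival with probability target_rate / mean_arrival.

    Assumes 0 <= target_rate <= mean_arrival so the ratio is a probability.
    """
    p = target_rate / mean_arrival
    return arrival if rng.random() < p else 0.0

def empirical_admitted_rate(target_rate, slots=100_000, seed=0):
    """Average admitted traffic per slot under the randomized rule."""
    rng = random.Random(seed)
    mean_arrival = 1.0  # arrivals are 0 or 2 with equal probability
    total = 0.0
    for _ in range(slots):
        arrival = 2.0 if rng.random() < 0.5 else 0.0
        total += admit(arrival, target_rate, mean_arrival, rng)
    return total / slots

rate = empirical_admitted_rate(0.6)
```

Because the admission coin is independent of the arrival, the expected admitted traffic is the mean arrival rate times the admission probability, which equals the target rate. The empirical average should therefore settle near 0.6.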